NOVEL POLYNUCLEOTIDES

The present application claims benefit of Japanese Patent Application Nos. 
Hei. 11-377484 (filed December 16, 1999), 2000-159162 (filed April 7, 2000) 
and 2000-280988 (filed August 3, 2000), the entire contents of each of which
is incorporated herein by reference. 

The contents of the attached CD-R compact discs are incorporated herein by
reference in their entirety. Each of the attached discs contains an identical
copy of a file "SEQ2.TXT", which was created on the discs on December 13, 2000,
and is 25,891 KB in size.

BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates to novel 
polynucleotides derived from microorganisms belonging to 
coryneform bacteria and fragments thereof, polypeptides 
encoded by the polynucleotides and fragments thereof, 
polynucleotide arrays comprising the polynucleotides and 
fragments thereof, computer readable recording media in 
which the nucleotide sequences of the polynucleotide and 
fragments thereof have been recorded, and use of them as 
well as a method of using the polynucleotide and/or 
polypeptide sequence information to make comparisons.

2. Brief Description of the Background Art 

Coryneform bacteria are used in producing various
useful substances, such as amino acids, nucleic acids,
vitamins, saccharides (for example, ribulose), organic
acids (for example, pyruvic acid), and analogues of the
above-described substances (for example, N-acetylamino
acids), and are very useful microorganisms industrially.
Many mutants thereof are known. 

For example, Corynebacterium glutamicum is a Gram-positive
bacterium identified as a glutamic acid-producing
bacterium, and many amino acids are produced by mutants
thereof. For example, 1,000,000 ton/year of L-glutamic
acid, which is useful as a seasoning for umami (delicious
taste), 250,000 ton/year of L-lysine, which is a valuable
additive for livestock feeds and the like, and several
hundred ton/year or more of other amino acids, such as
L-arginine, L-proline, L-glutamine, L-tryptophan, and the
like, have been produced in the world (Nikkei Bio Yearbook
99, published by Nikkei BP (1998)).

The production of amino acids by Corynebacterium
glutamicum is mainly carried out by its mutants (metabolic
mutants) which have mutated metabolic pathways and
regulatory systems. In general, an organism is provided
with various metabolic regulatory systems so as not to
produce more amino acids than it needs. In the
biosynthesis of L-lysine, for example, a microorganism
belonging to the genus Corynebacterium is regulated so as
to prevent excessive production through concerted
inhibition by lysine and threonine of the activity of
aspartokinase, a biosynthesis enzyme common to lysine,
threonine and methionine (J. Biochem., 65: 849-859
(1969)). The biosynthesis of arginine is controlled by
repression of the expression of its biosynthesis gene by
arginine so that an excessive amount of arginine is not
biosynthesized (Microbiology, 142: 99-108 (1996)). It is
considered that these metabolic regulatory mechanisms are
deregulated in amino acid-producing mutants. Similarly,
the metabolic regulation is deregulated in mutants
producing nucleic acids, vitamins, saccharides, organic
acids and analogues of the above-described substances so
as to improve the productivity of the objective product.

However, accumulation of basic genetic, biochemical
and molecular biological data on coryneform bacteria is
insufficient in comparison with Escherichia coli, Bacillus
subtilis, and the like. Also, few findings have been
obtained on mutated genes in amino acid-producing mutants.
Thus, there are various mechanisms, which are still unknown,
of regulating the growth and metabolism of these
microorganisms.

A chromosomal physical map of Corynebacterium
glutamicum ATCC 13032 has been reported, and it is known that its
genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265
(1996)). Calculating on the basis of the usual gene
density of bacteria, it is presumed that about 3,000 genes
are present in this genome of about 3,100 kb. However,
only about 100 genes, mainly amino acid
biosynthesis genes, are known in Corynebacterium glutamicum,
and the nucleotide sequences of most genes have not been
clarified hitherto.
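
The estimate in the preceding paragraph can be reproduced with simple arithmetic. The following is a minimal sketch, assuming an average density of roughly one gene per kb, which is a common rule of thumb for bacterial genomes and is not a figure stated in this specification; the function name is hypothetical.

```python
def estimate_gene_count(genome_size_kb: float, genes_per_kb: float = 1.0) -> int:
    """Estimate the number of genes from the genome size.

    genes_per_kb = 1.0 is an assumed rule-of-thumb density, not a value
    taken from the specification.
    """
    return round(genome_size_kb * genes_per_kb)


genome_size_kb = 3100  # approximate genome size of C. glutamicum ATCC 13032
known_genes = 100      # approximate number of genes known at the time

estimated = estimate_gene_count(genome_size_kb)
fraction_known = known_genes / estimated

print(f"estimated genes: about {estimated}")
print(f"fraction already known: about {fraction_known:.1%}")
```

Under this assumed density, only a few percent of the presumed genes had been characterized, which is the gap the paragraph describes.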

In recent years, the full nucleotide sequences of
the genomes of several microorganisms, such as Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have
been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-105 (1997)). Based on
the thus determined full nucleotide sequences, assumption
of gene regions and prediction of their function by
comparison with the nucleotide sequences of known genes
have been carried out. Thus, the functions of a great
number of genes have been presumed without genetic,
biochemical or molecular biological experiments.

In recent years, moreover, techniques for
monitoring expression levels of a great number of genes
simultaneously or detecting mutations, using DNA chips, DNA
arrays or the like in which a partial nucleic acid fragment
of a gene or a partial nucleic acid fragment in genomic DNA
other than a gene is fixed to a solid support, have been
developed. The techniques contribute to the analysis of
microorganisms, such as yeasts, Mycobacterium tuberculosis,
Mycobacterium bovis used in BCG vaccines, and the like
(Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999)).

SUMMARY OF THE INVENTION 
An object of the present invention is to provide a 
polynucleotide and a polypeptide derived from a 
microorganism of coryneform bacteria which are industrially 
useful, sequence information of the polynucleotide and the 
polypeptide, a method for analyzing the microorganism, an 
apparatus and a system for use in the analysis, and a
method for breeding the microorganism. 

The present invention provides a polynucleotide and 
an oligonucleotide derived from a microorganism belonging 
to coryneform bacteria, oligonucleotide arrays to which the
polynucleotides and the oligonucleotides are fixed, a 
polypeptide encoded by the polynucleotide, an antibody 
which recognizes the polypeptide, polypeptide arrays to 
which the polypeptides or the antibodies are fixed, a 
computer readable recording medium in which the nucleotide 
sequences of the polynucleotide and the oligonucleotide and 
the amino acid sequence of the polypeptide have been 
recorded, and a system based on the computer using the 
recording medium as well as a method of using the 
polynucleotide and/or polypeptide sequence information to 
make comparisons. 

BRIEF DESCRIPTION OF THE DRAWING 
Fig. 1 is a map showing the positions of typical
genes on the genome of Corynebacterium glutamicum ATCC
13032.

Fig. 2 is an electrophoresis image showing the results of
proteome analyses using proteins derived from (A)
Corynebacterium glutamicum ATCC 13032, (B) FERM BP-7134,
and (C) FERM BP-158.

Fig. 3 is a flow chart of an example of a system
using the computer readable media according to the present
invention.

Fig. 4 is a flow chart of an example of a system 
using the computer readable media according to the present 
invention.

DETAILED DESCRIPTION OF THE INVENTION 
This application is based on Japanese applications
No. Hei. 11-377484 filed on December 16, 1999, No. 2000-159162
filed on April 7, 2000 and No. 2000-280988 filed on
August 3, 2000, the entire contents of which are
incorporated herein by reference.

From the viewpoint that the determination of the 
full nucleotide sequence of Corynebacterium glutamicum 
would make it possible to specify gene regions which had 
not been previously identified, to determine the function 
of an unknown gene derived from the microorganism through 
comparison with nucleotide sequences of known genes and 
amino acid sequences of known genes, and to obtain a useful
mutant based on the presumption of the metabolic regulatory 
mechanism of a useful product by the microorganism, the 
inventors conducted intensive studies and, as a result, 
found that the complete genome sequence of Corynebacterium
glutamicum can be determined by applying the whole genome 
shotgun method. 

Specifically, the present invention relates to the
following (1) to (65):

(1) A method for at least one of the following: 

(A) identifying a mutation point of a gene derived from 
a mutant of a coryneform bacterium, 

(B) measuring an expression amount of a gene derived 
from a coryneform bacterium, 

(C) analyzing an expression profile of a gene derived 
from a coryneform bacterium, 

(D) analyzing expression patterns of genes derived from 
a coryneform bacterium, or 

(E) identifying a gene homologous to a gene derived 
from a coryneform bacterium, 

said method comprising: 

(a) producing a polynucleotide array by adhering to a 
solid support at least two polynucleotides selected from 
the group consisting of first polynucleotides comprising 
the nucleotide sequence represented by any one of SEQ ID 
NOS:1 to 3501, second polynucleotides which hybridize with
the first polynucleotides under stringent conditions, and 
third polynucleotides comprising a sequence of 10 to 200 
continuous bases of the first or second polynucleotides, 

(b) incubating the polynucleotide array with at least 
one of a labeled polynucleotide derived from a coryneform 
bacterium, a labeled polynucleotide derived from a mutant 
of the coryneform bacterium or a labeled polynucleotide to
be examined, under hybridization conditions, 

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 

As used herein, for example, the at least two
polynucleotides can be at least two of the first
polynucleotides, at least two of the second polynucleotides,
at least two of the third polynucleotides, or at least two
of the first, second and third polynucleotides.
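
As an illustration of step (d) in method (1), the hybridization signals of two labeled samples can be compared spot by spot. The following is a minimal sketch, assuming that a signal intensity has already been measured for each spot of the array; the spot names, intensity values, the `floor` parameter and the use of a log2 ratio are illustrative assumptions and are not prescribed by the specification.

```python
import math


def log2_ratios(reference, sample, floor=1.0):
    """Return the log2(sample / reference) signal ratio for each shared spot.

    reference, sample: dicts mapping a spot name to a signal intensity.
    floor: assumed minimum intensity, used to avoid division by zero.
    """
    ratios = {}
    for spot, ref_signal in reference.items():
        if spot not in sample:
            continue  # spot was not measured in the sample
        ref_value = max(ref_signal, floor)
        sample_value = max(sample[spot], floor)
        ratios[spot] = math.log2(sample_value / ref_value)
    return ratios


# Hypothetical intensities for a parent strain and a mutant strain.
parent = {"spot_1": 100.0, "spot_2": 50.0, "spot_3": 80.0}
mutant = {"spot_1": 400.0, "spot_2": 25.0}

for spot, ratio in log2_ratios(parent, mutant).items():
    print(spot, ratio)
```

A positive value suggests a higher expression amount in the mutant for the gene on that spot, and a negative value a lower one.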

(2) The method according to (1), wherein the coryneform
bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus
Microbacterium.

(3) The method according to (2), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the
polynucleotide derived from a coryneform bacterium, the
polynucleotide derived from a mutant of the coryneform
bacterium or the polynucleotide to be examined is a gene
relating to the biosynthesis of at least one compound
selected from an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid, and analogues thereof.

(5) The method according to (1), wherein the
polynucleotide to be examined is derived from Escherichia
coli.

(6) A polynucleotide array, comprising: 

at least two polynucleotides selected from the 
group consisting of first polynucleotides comprising the 
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3501, second polynucleotides which hybridize with the
first polynucleotides under stringent conditions, and third
polynucleotides comprising 10 to 200 continuous bases of
the first or second polynucleotides, and

a solid support adhered thereto.

As used herein, for example, the at least two 
polynucleotides can be at least two of the first 
polynucleotides, at least two of the second polynucleotides, 
at least two of the third polynucleotides, or at least two 
of the first, second and third polynucleotides.

(7) A polynucleotide comprising the nucleotide sequence 
represented by SEQ ID NO:1 or a polynucleotide having a
homology of at least 80% with the polynucleotide. 

(8) A polynucleotide comprising any one of the 
nucleotide sequences represented by SEQ ID NOS:2 to 3431, 
or a polynucleotide which hybridizes with the 
polynucleotide under stringent conditions. 

(9) A polynucleotide encoding a polypeptide having any
one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes
therewith under stringent conditions.

(10) A polynucleotide which is present in the 5'
upstream or 3' downstream of a polynucleotide comprising
the nucleotide sequence of any one of SEQ ID NOS:2 to 3431
in a whole polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of
regulating an expression of the polynucleotide.

(11) A polynucleotide comprising 10 to 200 continuous 
bases in the nucleotide sequence of the polynucleotide of 
any one of (7) to (10), or a polynucleotide comprising a 
nucleotide sequence complementary to the polynucleotide 
comprising 10 to 200 continuous bases.
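
The fragments of 10 to 200 continuous bases and their complementary sequences referred to in (11) can be enumerated mechanically. The following is a minimal sketch, assuming an unambiguous DNA sequence written with the letters A, C, G and T; the function names and the toy sequence are hypothetical, and the complementary strand is written in the conventional 5' to 3' direction (reverse complement).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def continuous_fragments(seq: str, length: int):
    """Yield every fragment of `length` continuous bases of `seq`."""
    if not 10 <= length <= 200:
        raise ValueError("length must be between 10 and 200 bases")
    for start in range(len(seq) - length + 1):
        yield seq[start:start + length]


toy_sequence = "ATGCATGCATGC"  # hypothetical 12-base sequence
for fragment in continuous_fragments(toy_sequence, 10):
    print(fragment, reverse_complement(fragment))
```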

(12) A recombinant DNA comprising the polynucleotide of
any one of (8) to (11).

(13) A transformant comprising the polynucleotide of any
one of (8) to (11) or the recombinant DNA of (12).

(14) A method for producing a polypeptide, comprising:

culturing the transformant of (13) in a medium to 
produce and accumulate a polypeptide encoded by the 
polynucleotide of (8) or (9) in the medium, and 

recovering the polypeptide from the medium. 

(15) A method for producing at least one of an amino
acid, a nucleic acid, a vitamin, a saccharide, an organic 
acid, and analogues thereof, comprising: 

culturing the transformant of (13) in a medium to
produce and accumulate at least one of an amino acid, a 
nucleic acid, a vitamin, a saccharide, an organic acid, and 
analogues thereof in the medium, and 

recovering the at least one of the amino acid, the 
nucleic acid, the vitamin, the saccharide, the organic acid, 
and analogues thereof from the medium. 

(16) A polypeptide encoded by a polynucleotide 
comprising the nucleotide sequence selected from SEQ ID 
NOS:2 to 3431. 

(17) A polypeptide comprising the amino acid sequence 
selected from SEQ ID NOS:3502 to 6931. 

(18) The polypeptide according to (16) or (17), wherein
at least one amino acid is deleted, replaced, inserted or
added, said polypeptide having an activity which is
substantially the same as that of the polypeptide without 
said at least one amino acid deletion, replacement, 
insertion or addition. 

(19) A polypeptide comprising an amino acid sequence 
having a homology of at least 60% with the amino acid 
sequence of the polypeptide of (16) or (17), and having an
activity which is substantially the same as that of the
polypeptide.



(20) An antibody which recognizes the polypeptide of any
one of (16) to (19).

(21) A polypeptide array, comprising: 

at least one polypeptide or partial fragment 
polypeptide selected from the polypeptides of (16) to (19) 
and partial fragment polypeptides of the polypeptides, and 

a solid support adhered thereto. 

(22) A polypeptide array, comprising: 

at least one antibody which recognizes a 
polypeptide or partial fragment polypeptide selected from 
the polypeptides of (16) to (19) and partial fragment 
polypeptides of the polypeptides, and

a solid support adhered thereto. 

(23) A system based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one
nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif
information;

(ii) a data storage device for at least temporarily 
storing the input information; 

(iii) a comparator that compares the at least one
nucleotide sequence information selected from SEQ ID NOS:1
to 3501 with the target sequence or target structure motif
information, recorded by the data storage device, for
screening and analyzing nucleotide sequence information
which is coincident with or analogous to the target
sequence or target structure motif information; and

(iv) an output device that shows a screening or
analyzing result obtained by the comparator.

(24) A method based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence 
information selected from SEQ ID NOS:1 to 3501, target
sequence information or target structure motif information 
into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence 
information selected from SEQ ID NOS:1 to 3501 with the
target sequence or target structure motif information; and 

(iv) screening and analyzing nucleotide sequence 
information which is coincident with or analogous to the 
target sequence or target structure motif information. 
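
Steps (i) to (iv) of method (24) can be sketched as a simple screening routine. The following is a minimal illustration, assuming that the recorded nucleotide sequences are held in memory as plain strings and that "analogous" is approximated by a maximum number of mismatched bases; the sequence identifiers, the toy sequences and the mismatch criterion are assumptions introduced for illustration and are not defined by the specification.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def screen(stored, target, max_mismatches=0):
    """Find windows of the stored sequences that match the target.

    stored: dict mapping a sequence identifier to a nucleotide sequence.
    Returns a list of (identifier, 0-based position, mismatches) tuples.
    max_mismatches=0 finds coincident hits; larger values also find
    analogous hits.
    """
    hits = []
    width = len(target)
    for seq_id, sequence in stored.items():
        for start in range(len(sequence) - width + 1):
            mismatches = count_mismatches(sequence[start:start + width], target)
            if mismatches <= max_mismatches:
                hits.append((seq_id, start, mismatches))
    return hits


# Hypothetical stored sequences standing in for recorded sequence information.
stored_sequences = {"seq_a": "ATGGCCTTAA", "seq_b": "ATGGACTTAA"}

print(screen(stored_sequences, "GGCC"))                    # coincident hits only
print(screen(stored_sequences, "GGCC", max_mismatches=1))  # plus analogous hits
```

A practical system would more likely rely on an alignment-based search; the sliding-window comparison is used here only because it keeps the comparator step easy to follow.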

(25) A system based on a computer for identifying a 
target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001, and target sequence or target structure motif
information;

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one amino 
acid sequence information selected from SEQ ID NOS:3502 to 
7001 with the target sequence or target structure motif 
information, recorded by the data storage device for 
screening and analyzing amino acid sequence information 
which is coincident with or analogous to the target 
sequence or target structure motif information; and 

(iv) an output device that shows a screening or 
analyzing result obtained by the comparator. 

(26) A method based on a computer for identifying a
target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001, and 
target sequence information or target structure motif 
information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001 with the 
target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence 
information which is coincident with or analogous to the 
target sequence or target structure motif information. 

(27) A system based on a computer for determining a
function of a polypeptide encoded by a polynucleotide
having a target nucleotide sequence derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one
nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by
the nucleotide sequence, and target nucleotide sequence
information;

(ii) a data storage device for at least temporarily 
storing the input information; 

(iii) a comparator that compares the at least one
nucleotide sequence information selected from SEQ ID NOS:2
to 3501 with the target nucleotide sequence information,
and determines a function of a polypeptide encoded by a
polynucleotide having the target nucleotide sequence which
is coincident with or analogous to the polynucleotide
having at least one nucleotide sequence selected from SEQ
ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by
the comparator.

(28) A method based on a computer for determining a
function of a polypeptide encoded
by a polynucleotide having a target nucleotide sequence
derived from a coryneform bacterium, comprising the
following:

(i) inputting at least one nucleotide sequence
information selected from SEQ ID NOS:2 to 3501, function
information of a polypeptide encoded by the nucleotide
sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence 
information selected from SEQ ID NOS:2 to 3501 with the
target nucleotide sequence information; and 

(iv) determining a function of a polypeptide encoded by 
a polynucleotide having the target nucleotide sequence 
which is coincident with or analogous to the polynucleotide 
having at least one nucleotide sequence selected from SEQ 
ID NOS:2 to 3501.
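
The function determination of method (28) can be sketched as a lookup of the most similar annotated sequence. The following is a minimal illustration, assuming that each recorded sequence is stored together with its function information; the identifiers, sequences, function labels and the 0.8 threshold are hypothetical, and the `difflib.SequenceMatcher` ratio from the Python standard library is used only as a crude stand-in for a proper alignment score.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude similarity score in [0, 1]; a stand-in for a real alignment tool."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def predict_function(annotated, target, threshold=0.8):
    """Return (identifier, function) of the best match, or None.

    annotated: dict mapping an identifier to a (sequence, function) pair.
    threshold: assumed minimum score for two sequences to count as analogous.
    """
    best = None
    best_score = threshold
    for seq_id, (sequence, function) in annotated.items():
        score = similarity(sequence, target)
        if score >= best_score:
            best, best_score = (seq_id, function), score
    return best


# Hypothetical annotated entries standing in for recorded sequence information.
annotated_sequences = {
    "orf_x": ("ATGAAACCCGGGTTT", "hypothetical kinase"),
    "orf_y": ("ATGTTTTTTTTTTTT", "hypothetical transporter"),
}

print(predict_function(annotated_sequences, "ATGAAACCCGGGTTA"))
print(predict_function(annotated_sequences, "CCCCCCCCCCCCCCC"))
```

When no recorded sequence reaches the threshold, the sketch returns None instead of assigning a function.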

(29) A system based on a computer for determining a
function of a polypeptide having a target amino acid
sequence derived from a coryneform bacterium, comprising
the following:

(i) a user input device that inputs at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001, function information based on the amino acid sequence,
and target amino acid sequence information;

(ii) a data storage device for at least temporarily
storing the input information;

(iii) a comparator that compares the at least one amino
acid sequence information selected from SEQ ID NOS:3502 to
7001 with the target amino acid sequence information for
determining a function of a polypeptide having the target
amino acid sequence which is coincident with or analogous
to the polypeptide having at least one amino acid sequence
selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by 
the comparator. 

(30) A method based on a computer for determining a 
function of a polypeptide having a target amino acid 
sequence derived from a coryneform bacterium, comprising 
the following: 

(i) inputting at least one amino acid sequence 
information selected from SEQ ID NOS:3502 to 7001, function 
information based on the amino acid sequence, and target 
amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence
information selected from SEQ ID NOS:3502 to 7001 with the
target amino acid sequence information; and 

(iv) determining a function of a polypeptide having the 
target amino acid sequence which is coincident with or 
analogous to the polypeptide having at least one amino acid 
sequence selected from SEQ ID NOS:3502 to 7001.

(31) The system according to any one of (23), (25), (27)
and (29), wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or
the genus Microbacterium.

(32) The method according to any one of (24), (26), (28)
and (30), wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or
the genus Microbacterium.

(33) The system according to (31), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the
microorganism belonging to the genus Corynebacterium is
selected from the group consisting of Corynebacterium
glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae,
Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium
thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is
readable by a computer in which at least one nucleotide
sequence information selected from SEQ ID NOS:1 to 3501 or
function information based on the nucleotide sequence is
recorded, and is usable in the system of (23) or (27) or
the method of (24) or (28).

(36) A recording medium or storage device which is
readable by a computer in which at least one amino acid
sequence information selected from SEQ ID NOS:3502 to 7001
or function information based on the amino acid sequence is
recorded, and is usable in the system of (25) or (29) or
the method of (26) or (30).

(37) The recording medium or storage device according to
(35) or (36), which is a computer readable recording medium
selected from the group consisting of a floppy disc, a hard
disc, a magnetic tape, a random access memory (RAM), a read
only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R,
CD-RW, DVD-ROM, DVD-RAM and DVD-RW.

(38) A polypeptide having homoserine dehydrogenase
activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of
homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than
a Val residue.

(39) A polypeptide comprising an amino acid sequence in
which the Val residue at the 59th position in the amino
acid sequence as represented by SEQ ID NO:6952 is replaced
with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein
the Val residue at the 59th position is replaced with an
Ala residue.
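
A single-residue replacement such as the Val to Ala replacement at the 59th position in (40) can be expressed as a checked substitution on a one-letter amino acid sequence. The following is a minimal sketch; the function name is hypothetical and the sequence is a made-up placeholder that merely carries a Val residue at position 59, not the homoserine dehydrogenase sequence of the sequence listing.

```python
def substitute_residue(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after checking the original residue."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Placeholder sequence (an assumption for illustration): Val (V) at position 59.
placeholder = "M" + "G" * 57 + "V" + "K" * 5

variant = substitute_residue(placeholder, 59, "V", "A")
print(placeholder[58], "->", variant[58])
```

Checking the expected residue first guards against numbering errors, for example when a sequence is counted with or without its initiator methionine.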



(41) A polypeptide having pyruvate carboxylase activity,
comprising an amino acid sequence in which the Pro residue
at the 458th position in the amino acid sequence of
pyruvate carboxylase derived from a coryneform bacterium is
replaced with an amino acid residue other than a Pro
residue.

(42) A polypeptide comprising an amino acid sequence in
which the Pro residue at the 458th position in the amino
acid sequence represented by SEQ ID NO:4265 is replaced
with an amino acid residue other than a Pro residue.

(43) The polypeptide according to (41) or (42), wherein
the Pro residue at the 458th position is replaced with a
Ser residue.

(44) The polypeptide according to any of (38) to
(43), which is derived from Corynebacterium glutamicum.

(45) A DNA encoding the polypeptide of any one of (38)
to (44).

(46) A recombinant DNA comprising the DNA of (45).

(47) A transformant comprising the recombinant DNA of
(46).

(48) A transformant comprising in its chromosome the DNA
of (45).

(49) The transformant according to (47) or (48), which
is derived from a coryneform bacterium.

(50) The transformant according to (49), which is
derived from Corynebacterium glutamicum.

(51) A method for producing L-lysine, comprising:

culturing the transformant of any one of (47) to
(50) in a medium to produce and accumulate L-lysine in the
medium, and

recovering the L-lysine from the culture.

(52) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene
of a production strain derived from a coryneform bacterium which
has been subjected to mutation breeding so as to produce at
least one compound selected from an amino acid, a nucleic
acid, a vitamin, a saccharide, an organic acid, and
analogues thereof by a fermentation method, with a
corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the
production strain based on a result obtained by (i);

(iii) introducing the mutation point into a coryneform
bacterium which is free of the mutation point; and

(iv) examining productivity by the fermentation method
of the compound selected in (i) of the coryneform bacterium
obtained in (iii).
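
Steps (i) and (ii) of method (52) amount to comparing a production-strain sequence with the corresponding reference sequence and listing the positions that differ. The following is a minimal sketch, assuming that the two sequences have already been aligned to the same length and differ only by base substitutions; insertions and deletions would require a real alignment step, and the function name and toy sequences are hypothetical.

```python
def find_mutation_points(reference: str, production: str):
    """List base substitutions between two pre-aligned, equal-length sequences.

    Returns (1-based position, reference base, production-strain base) tuples.
    """
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, prod_base)
        for index, (ref_base, prod_base) in enumerate(zip(reference, production))
        if ref_base != prod_base
    ]


# Hypothetical gene fragments from a reference strain and a production strain.
reference_fragment = "ATGGTTCCA"
production_fragment = "ATGGCTCCA"

for position, ref_base, prod_base in find_mutation_points(
    reference_fragment, production_fragment
):
    print(f"position {position}: {ref_base} -> {prod_base}")
```

Each reported position is only a candidate mutation point; whether it contributes to productivity is what steps (iii) and (iv) are intended to examine.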

(53) The method according to (52), wherein the gene is a
gene encoding an enzyme in a biosynthetic pathway or a 
signal transmission pathway. 



(54) The method according to (52), wherein the mutation
point is a mutation point relating to a useful mutation
which improves or stabilizes the productivity.

(55) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene
of a production strain derived from a coryneform bacterium which
has been subjected to mutation breeding so as to produce at
least one compound selected from an amino acid, a nucleic
acid, a vitamin, a saccharide, an organic acid, and
analogues thereof by a fermentation method, with a
corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the
production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform
bacterium having the mutation point; and

(iv) examining productivity by the fermentation method
of the compound selected in (i) of the coryneform bacterium
obtained in (iii).

(56) The method according to (55), wherein the gene is a
gene encoding an enzyme in a biosynthetic pathway or a
signal transmission pathway.

(57) The method according to (55), wherein the mutation
point is a mutation point which decreases or destabilizes
the productivity.

(58) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of
at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an
isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the
same activity simultaneously; and

(iv) examining productivity by a fermentation method of
the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using
the nucleotide sequence information represented by SEQ ID
NOS:2 to 3431, comprising the following:

(i) arranging function information of an open reading
frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an
enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or
signal transmission pathway of a coryneform bacterium in
combination with information relating to a known biosynthesis
pathway or signal transmission pathway of a coryneform
bacterium;

(iv) comparing the pathway explicated in (iii) with a
biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium
based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the
biosynthesis of the target useful product in (iv) or weaken
a pathway which is judged not to be important in the
biosynthesis of the target useful product in (iv).

(60) A coryneform bacterium, bred by the method of any 
one of (52) to (59). 

(61) The coryneform bacterium according to (60), which 
is a microorganism belonging to the genus Corynebacterium, 
the genus Brevibacterium, or the genus Microbacterium. 

(62) The coryneform bacterium according to (61), wherein 
the microorganism belonging to the genus Corynebacterium is 
selected from the group consisting of Corynebacterium 
glutamicum, Corynebacterium acetoacidophilum, 
Corynebacterium acetoglutamicum, Corynebacterium callunae, 
Corynebacterium herculis, Corynebacterium lilium, 
Corynebacterium melassecola, Corynebacterium 
thermoaminogenes, and Corynebacterium ammoniagenes. 

(63) A method for producing at least one compound 
selected from an amino acid, a nucleic acid, a vitamin, a 
saccharide, an organic acid and an analogue thereof, 
comprising: 

culturing a coryneform bacterium of any one of (60) 
to (62) in a medium to produce and accumulate at least one 
compound selected from an amino acid, a nucleic acid, a 
vitamin, a saccharide, an organic acid, and analogues 
thereof; and 

recovering the compound from the culture. 

(64) The method according to (63), wherein the compound 
is L-lysine. 

(65) A method for identifying a protein relating to 
useful mutation based on proteome analysis, comprising the 
following: 

(i) preparing 

a protein derived from a bacterium of a production 
strain of a coryneform bacterium which has been subjected 
to mutation breeding by a fermentation process so as to 
produce at least one compound selected from an amino acid, 
a nucleic acid, a vitamin, a saccharide, an organic acid, 
and analogues thereof, and 

a protein derived from a bacterium of a parent 
strain of the production strain; 

(ii) separating the proteins prepared in (i) by two 
dimensional electrophoresis; 

(iii) detecting the separated proteins, and comparing an 
expression amount of the protein derived from the 
production strain with that derived from the parent strain; 

(iv) treating the protein showing different expression 
amounts as a result of the comparison with a peptidase to 
extract peptide fragments; 

(v) analyzing amino acid sequences of the peptide 
fragments obtained in (iv); and 

(vi) comparing the amino acid sequences obtained in (v) 
with the amino acid sequence represented by SEQ ID NOS:3502 
to 7001 to identify the protein having the amino acid 
sequences. 

As used herein, the term "proteome", which is a 
word coined by combining "protein" with "genome", refers to 
a method for examining a gene at the polypeptide level. 

(66) The method according to (65), wherein the 
coryneform bacterium is a microorganism belonging to the 
genus Corynebacterium, the genus Brevibacterium, or the 
genus Microbacterium. 

(67) The method according to (66), wherein the 
microorganism belonging to the genus Corynebacterium is 
selected from the group consisting of Corynebacterium 
glutamicum, Corynebacterium acetoacidophilum, 
Corynebacterium acetoglutamicum, Corynebacterium callunae, 
Corynebacterium herculis, Corynebacterium lilium, 
Corynebacterium melassecola, Corynebacterium 
thermoaminogenes, and Corynebacterium ammoniagenes. 

(68) A biologically pure culture of Corynebacterium 
glutamicum AHP-3 (FERM BP-7382). 

The present invention will be described below in 
more detail, based on the determination of the full 
nucleotide sequence of coryneform bacteria. 

1. Determination of full nucleotide sequence of coryneform 
bacteria 

The term "coryneform bacteria" as used herein means 
a microorganism belonging to the genus Corynebacterium, the 
genus Brevibacterium or the genus Microbacterium as defined 
in Bergey's Manual of Determinative Bacteriology, 8: 599 
(1974). 

Examples include Corynebacterium acetoacidophilum, 
Corynebacterium acetoglutamicum, Corynebacterium callunae, 
Corynebacterium glutamicum, Corynebacterium herculis, 
Corynebacterium lilium, Corynebacterium melassecola, 
Corynebacterium thermoaminogenes, Brevibacterium 
saccharolyticum, Brevibacterium immariophilum, 
Brevibacterium roseum, Brevibacterium thiogenitalis, 
Microbacterium ammoniaphilum, and the like. 

Specific examples include Corynebacterium 
acetoacidophilum ATCC 13870, Corynebacterium 
acetoglutamicum ATCC 15806, Corynebacterium callunae ATCC 
15991, Corynebacterium glutamicum ATCC 13032, 
Corynebacterium glutamicum ATCC 13060, Corynebacterium 
glutamicum ATCC 13826 (prior genus and species: 
Brevibacterium flavum, or Corynebacterium lactofermentum), 
Corynebacterium glutamicum ATCC 14020 (prior genus and 
species: Brevibacterium divaricatum), Corynebacterium 
glutamicum ATCC 13869 (prior genus and species: 
Brevibacterium lactofermentum), Corynebacterium herculis 
ATCC 13868, Corynebacterium lilium ATCC 15990, 
Corynebacterium melassecola ATCC 17965, Corynebacterium 
thermoaminogenes FERM 9244, Brevibacterium saccharolyticum 
ATCC 14066, Brevibacterium immariophilum ATCC 14068, 
Brevibacterium roseum ATCC 13925, Brevibacterium 
thiogenitalis ATCC 19240, Microbacterium ammoniaphilum ATCC 
15354, and the like. 

(1) Preparation of genome DNA of coryneform bacteria 

Coryneform bacteria can be cultured by a 
conventional method. 

Either a natural medium or a synthetic medium can 
be used, so long as it is a medium suitable for efficient 
culturing of the microorganism, and it contains a carbon 
source, a nitrogen source, an inorganic salt, and the like 
which can be assimilated by the microorganism. 

In Corynebacterium glutamicum, for example, a BY 
medium (7 g/l meat extract, 10 g/l peptone, 3 g/l sodium 
chloride, 5 g/l yeast extract, pH 7.2) containing 1% of 
glycine and the like can be used. The culturing is carried 
out at 25 to 35°C overnight. 

After the completion of the culture, the cells are 
recovered from the culture by centrifugation. The 
resulting cells are washed with a washing solution. 
Examples of the washing solution include STE buffer 
(10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l 
ethylenediaminetetraacetic acid (hereinafter referred to as 
"EDTA"), pH 8.0), and the like. 

Genome DNA can be obtained from the washed cells 
according to a conventional method for obtaining genome DNA, 
namely, lysing the cell wall of the cells using a lysozyme 
and a surfactant (SDS, etc.), eliminating proteins and the 
like using a phenol solution and a phenol/chloroform 
solution, and then precipitating the genome DNA with 
ethanol or the like. Specifically, the following method 
can be illustrated. 

The washed cells are suspended in a washing 
solution containing 5 to 20 mg/l lysozyme. After shaking, 
5 to 20% SDS is added to lyse the cells. Usually, shaking 
is gently performed at 25 to 40°C for 30 minutes to 2 hours. 
After shaking, the suspension is maintained at 60 to 70°C 
for 5 to 15 minutes for the lysis. 

After the lysis, the suspension is cooled to 
ordinary temperature, and 5 to 20 ml of Tris-neutralized 
phenol is added thereto, followed by gentle shaking at room 
temperature for 15 to 45 minutes. 

After shaking, centrifugation (15,000 x g, 20 
minutes, 20°C) is carried out to fractionate the aqueous 
layer. 

After performing extraction with phenol/chloroform 
and extraction with chloroform (twice) in the same manner, 
3 mol/l sodium acetate solution (pH 5.2) and isopropanol 
are added to the aqueous layer at 1/10 times volume and 2 
times volume of the aqueous layer, respectively, followed 
by gentle stirring to precipitate the genome DNA. 
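
The volume ratios stated above can be worked out for any recovered aqueous layer. The following is a minimal sketch only; the function name precipitation_volumes and the 5 ml example volume are hypothetical and do not appear in the procedure itself.

```python
def precipitation_volumes(aqueous_ml: float) -> dict:
    """Return the reagent volumes for the precipitation step.

    Ratios taken from the text: sodium acetate solution at 1/10 volume
    and isopropanol at 2 volumes of the aqueous layer.
    """
    if aqueous_ml <= 0:
        raise ValueError("aqueous layer volume must be positive")
    return {
        "sodium_acetate_ml": aqueous_ml / 10,  # 3 mol/l, pH 5.2
        "isopropanol_ml": aqueous_ml * 2,
    }


# Hypothetical example: 5 ml of aqueous layer recovered after extraction.
volumes = precipitation_volumes(5.0)
```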

The genome DNA is dissolved again in a buffer 
containing 0.01 to 0.04 mg/ml RNase. As an example of the 
buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l 
EDTA, pH 8.0) can be used. After dissolving, the resultant 
solution is maintained at 25 to 40°C for 20 to 50 minutes 
and then extracted successively with phenol, 
phenol/chloroform and chloroform as in the above case. 

After the extraction, isopropanol precipitation is 
carried out and the resulting DNA precipitate is washed 
with 70% ethanol, followed by air drying, and then 
dissolved in TE buffer to obtain a genome DNA solution. 

(2) Production of shotgun library 

Methods for producing a genome DNA library using the 
genome DNA of the coryneform bacteria prepared in the above 
(1) include a method described in Molecular Cloning, A 
Laboratory Manual, Second Edition (1989) (hereinafter 
referred to as "Molecular Cloning, 2nd ed."). In 
particular, the following method can be exemplified to 
prepare a genome DNA library appropriately usable in 
determining the full nucleotide sequence by the shotgun 
method. 






To 0.01 mg of the genome DNA of the coryneform 
bacteria prepared in the above (1), a buffer, such as TE 
buffer or the like, is added to give a total volume of 0.4 
ml. Then, the genome DNA is fragmented into fragments of 1 
to 10 kb with a sonicator (Yamato Powersonic Model 50). 
The treatment with the sonicator is performed at an output 
of 20 continuously for 5 seconds. 

The resulting genome DNA fragments are blunt-ended 
using DNA blunting kit (manufactured by Takara Shuzo) or 
the like. 

The blunt-ended genome fragments are fractionated 
by agarose gel or polyacrylamide gel electrophoresis and 
genome fragments of 1 to 2 kb are cut out from the gel. 

To the gel, 0.2 to 0.5 ml of a buffer for eluting 
DNA, such as MG elution buffer (0.5 mol/l ammonium acetate, 
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or 
the like, is added, followed by shaking at 25 to 40°C 
overnight to elute DNA. 

The resulting DNA eluate is treated with 
phenol/chloroform and then precipitated with ethanol to 
obtain a genome library insert. 

This insert is ligated into a suitable vector, such 
as pUC18 SmaI/BAP (manufactured by Amersham Pharmacia 
Biotech) or the like, using T4 ligase (manufactured by 
Takara Shuzo) or the like. The ligation can be carried out 
by allowing a mixture to stand at 10 to 20°C for 20 to 50 
hours. 

The resulting ligation product is precipitated with 
ethanol and dissolved in 5 to 20 µl of TE buffer. 

Escherichia coli is transformed in accordance with 
a conventional method using 0.5 to 2 µl of the ligation 
solution. Examples of the transformation method include 
the electroporation method using ELECTRO MAX DH10B 
(manufactured by Life Technologies) for Escherichia coli. 
The electroporation method can be carried out under the 
conditions as described in the manufacturer's instructions. 

The transformed Escherichia coli is spread on a 
suitable selection medium containing agar, for example, LB 
plate medium containing 10 to 100 mg/l ampicillin (LB 
medium (10 g/l bactotryptone, 5 g/l yeast extract, 10 g/l 
sodium chloride, pH 7.0) containing 1.6% of agar) when 
pUC18 is used as the cloning vector, and cultured therein. 

The transformant can be obtained as colonies formed 
on the plate medium. In this step, it is possible to 
select the transformant having the recombinant DNA 
containing the genome DNA as white colonies by adding X-gal 
and IPTG (isopropyl-β-thiogalactopyranoside) to the plate 
medium. 

The transformant is allowed to stand for culturing 
in a 96-well titer plate to which 0.05 ml of the LB medium 
containing 0.1 mg/ml of ampicillin has been added in each 
well. The resulting culture can be used in an experiment 
of (4) described below. Also, the culture solution can be 
stored at -80°C by adding 0.05 ml per well of the LB medium 
containing 20% glycerol to the culture solution, followed 
by mixing, and the stored culture solution can be used at 
any time. 

(3) Production of cosmid library 

The genome DNA (0.1 mg) of the coryneform bacteria 
prepared in the above (1) is partially digested with a 
restriction enzyme, such as Sau3AI or the like, and then 
ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to 
40% sucrose density gradient using a 10% sucrose buffer (1 
mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA, 
10% sucrose, pH 8.0) and a 40% sucrose buffer (elevating 
the concentration of the 10% sucrose buffer to 40%). 

After the centrifugation, the thus separated 
solution is fractionated into tubes at 1 ml per tube. 
After confirming the DNA fragment size of each fraction by 
agarose gel electrophoresis, a fraction rich in DNA 
fragments of about 40 kb is precipitated with ethanol. 

The resulting DNA fragment is ligated to a cosmid 
vector having a cohesive end which can be ligated to the 
fragment. When the genome DNA is partially digested with 
Sau3AI, the partially digested product can be ligated to, 
for example, the BamHI site of superCos1 (manufactured by 
Stratagene) in accordance with the manufacturer's 
instructions. 

The resulting ligation product is packaged using a 
packaging extract which can be prepared by a method 
described in Molecular Cloning, 2nd ed. and then used in 
transforming Escherichia coli. More specifically, the 
ligation product is packaged using, for example, a 
commercially available packaging extract, Gigapack III Gold 
Packaging Extract (manufactured by Stratagene) in 
accordance with the manufacturer's instructions and then 
introduced into Escherichia coli XL-1-BlueMR (manufactured 
by Stratagene) or the like. 

The thus transformed Escherichia coli is spread on 
an LB plate medium containing ampicillin, and cultured 
therein. 

The transformant can be obtained as colonies formed 
on the plate medium. 

The transformant is subjected to standing culture 
in a 96-well titer plate to which 0.05 ml of the LB medium 
containing 0.1 mg/ml ampicillin has been added. 

The resulting culture can be employed in an 
experiment of (4) described below. Also, the culture 
solution can be stored at -80°C by adding 0.05 ml per well 
of the LB medium containing 20% glycerol to the culture 
solution, followed by mixing, and the stored culture 
solution can be used at any time. 

(4) Determination of nucleotide sequence 
(4-1) Preparation of template 

The full nucleotide sequence of genome DNA of 
coryneform bacteria can be determined basically according 
to the whole genome shotgun method (Science, 269: 496-512 
(1995)). 

The template used in the whole genome shotgun 
method can be prepared by PCR using the library prepared in 
the above (2) (DNA Research, 5: 1-9 (1998)). 

Specifically, the template can be prepared as 
follows. 

The clone derived from the whole genome shotgun 
library is inoculated by using a replicator (manufactured 
by GENETIX) into each well of a 96-well plate to which 0.08 
ml per well of the LB medium containing 0.1 mg/ml 
ampicillin has been added, followed by stationary 
culturing at 37°C overnight. 

Next, the culture solution is transferred, using a 
copy plate (manufactured by Tokken), into each well of a 
96-well reaction plate (manufactured by PE Biosystems) to 
which 0.025 ml per well of a PCR reaction solution has been 
added using TaKaRa Ex Taq (manufactured by Takara Shuzo). 
Then, PCR is carried out in accordance with the protocol by 
Makino et al. (DNA Research, 5: 1-9 (1998)) using GeneAmp 
PCR System 9700 (manufactured by PE Biosystems) to amplify 
the inserted fragments. 

The excessive primers and nucleotides are 
eliminated using a kit for purifying a PCR product, and the 
product is used as the template in the sequencing reaction. 

It is also possible to determine the nucleotide 
sequence using a double-stranded DNA plasmid as a template. 

The double-stranded DNA plasmid used as the 
template can be obtained by the following method. 

The clone derived from the whole genome shotgun 
library is inoculated into each well of a 24- or 96-well 
plate to which 1.5 ml per well of a 2 x YT medium (16 g/l 
bactotryptone, 10 g/l yeast extract, 5 g/l sodium chloride, 
pH 7.0) containing 0.05 mg/ml ampicillin has been added, 
followed by culturing under shaking at 37°C overnight. 

The double-stranded DNA plasmid can be prepared 
from the culture solution using an automatic plasmid 
preparing machine KURABO PI-50 (manufactured by Kurabo 
Industries), a MultiScreen (manufactured by Millipore) or 
the like, according to each protocol. 

To purify the plasmid, Biomek 2000 manufactured by 
Beckman Coulter and the like can be used. 

The resulting purified double-stranded DNA plasmid 
is dissolved in water to give a concentration of about 0.1 
mg/ml. Then, it can be used as the template in sequencing. 

(4-2) Sequencing reaction 

The sequencing reaction can be carried out 
according to a commercially available sequence kit or the 
like. A specific method is exemplified below. 

To 6 µl of a solution of ABI PRISM BigDye 
Terminator Cycle Sequencing Ready Reaction Kit 
(manufactured by PE Biosystems), 1 to 2 pmol of an M13 
regular direction primer (M13-21) or an M13 reverse 
direction primer (M13REV) (DNA Research, 5: 1-9 (1998)) and 
50 to 200 ng of the template prepared in the above (4-1) 
(the PCR product or plasmid) are added to give 10 µl of a 
sequencing reaction solution. 

A dye terminator sequencing reaction (35 to 55 
cycles) is carried out using this reaction solution and 
GeneAmp PCR System 9700 (manufactured by PE Biosystems) or 
the like. The cycle parameter can be determined in 
accordance with a commercially available kit, for example, 
the manufacturer's instructions attached to ABI PRISM Big 
Dye Terminator Cycle Sequencing Ready Reaction Kit. 

The sample can be purified using a commercially 
available product, such as MultiScreen HV plate 
(manufactured by Millipore) or the like, according to the 
manufacturer's instructions. 

The thus purified reaction product is precipitated 
with ethanol, dried and then used for the analysis. The 
dried reaction product can be stored in the dark at -30°C 
and the stored reaction product can be used at any time. 

The dried reaction product can be analyzed using a 
commercially available sequencer and an analyzer according 
to the manufacturer's instructions. 

Examples of the commercially available sequencer 
include ABI PRISM 377 DNA Sequencer (manufactured by PE 
Biosystems). Examples of the analyzer include ABI PRISM 
3700 DNA Analyzer (manufactured by PE Biosystems). 

(5) Assembly 

Software, such as phred (The University of 
Washington) or the like, can be used for base calling 
in analyzing the sequence information obtained in the above 
(4). Software, such as Cross_Match (The University of 
Washington), SPS Cross_Match (manufactured by Southwest 
Parallel Software) or the like, can be used to mask the 
vector sequence information. 

For the assembly, software, such as phrap (The 
University of Washington), SPS phrap (manufactured by 
Southwest Parallel Software) or the like, can be used. 

For the above analysis and the output of the results 
thereof, a computer such as a UNIX machine, a PC, a 
Macintosh, or the like can be used. 

Contigs obtained by the assembly can be analyzed 
using a graphical editor such as consed (The University of 
Washington) or the like. 

It is also possible to perform the series of 
operations from the base call to the assembly in a lump 
using a script phredPhrap attached to consed. 
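
The idea of assembling overlapping reads into a contig can be illustrated by the following toy sketch. It is a simple greedy suffix-prefix merge written for illustration only and is not the algorithm implemented in phrap; it assumes error-free reads on one strand, and the names overlap and greedy_assemble and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a equal to a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # no remaining overlaps: these are the contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads that overlap end to end.
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTGA"])
```

Reads that share no overlap of at least min_len bases remain as separate contigs, which corresponds to the gap parts treated in the next item.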

As used herein, software will be understood to also 
be referred to as a comparator. 

(6) Determination of nucleotide sequence in gap part 

Each of the cosmids in the cosmid library 
constructed in the above (3) is prepared in the same manner 
as in the preparation of the double-stranded DNA plasmid 
described in the above (4-1). The nucleotide sequence at 
the end of the insert fragment of the cosmid is determined 
using a commercially available kit, such as ABI PRISM 
BigDye Terminator Cycle Sequencing Ready Reaction Kit 
(manufactured by PE Biosystems) according to the 
manufacturer's instructions. 

About 800 cosmid clones are sequenced at both ends 
of the inserted fragment to detect a nucleotide sequence in 
the contig derived from the shotgun sequencing obtained in 
(5) which is coincident with the sequence. Thus, the 
linkage between respective cosmid clones and respective 
contigs is clarified, and mutual alignment is carried out. 
Furthermore, the results are compared with known physical 
maps to map the cosmids and the contigs. In the case of 
Corynebacterium glutamicum ATCC 13032, the physical map 
described in Mol. Gen. Genet., 252: 255-265 (1996) can be 
used. 

The sequence in the region which cannot be covered 
with the contigs (gap part) can be determined by the 
following method. 

Clones containing sequences positioned at the ends 
of the contigs are selected. Among these, a clone wherein 
only one end of the inserted fragment has been determined 
is selected and the sequence at the opposite end of the 
inserted fragment is determined. 

A shotgun library clone or a cosmid clone derived 
therefrom containing the sequences at the respective ends 
of the inserted fragments in the two contigs is identified 
and the full nucleotide sequence of the inserted fragment 
of the clone is determined. 

According to this method, the nucleotide sequence 
of the gap part can be determined. 
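
The identification of a clone whose two end sequences fall in two different contigs can be sketched as follows. This is a minimal illustration assuming exact matches of the end sequences on either strand; the function names revcomp, locate and bridging_clones and all sequences shown are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(end_seq: str, contigs: dict):
    """Return the name of the first contig containing end_seq on either strand."""
    for name, seq in contigs.items():
        if end_seq in seq or revcomp(end_seq) in seq:
            return name
    return None


def bridging_clones(clone_ends: dict, contigs: dict) -> dict:
    """Map each clone whose two ends lie in different contigs to that contig pair."""
    links = {}
    for clone, (end_a, end_b) in clone_ends.items():
        hit_a, hit_b = locate(end_a, contigs), locate(end_b, contigs)
        if hit_a and hit_b and hit_a != hit_b:
            links[clone] = tuple(sorted((hit_a, hit_b)))
    return links


# Hypothetical contigs and clone end sequences.
contigs = {"contig1": "ATGGCGTACCTTGA", "contig2": "GGGTTTCCCAAA"}
clone_ends = {
    "cos1": ("GCGTAC", "TTTGGG"),  # second end matches contig2 on the reverse strand
    "cos2": ("ATGGCG", "CCTTGA"),  # both ends inside contig1
}
links = bridging_clones(clone_ends, contigs)
```

A clone reported by bridging_clones spans the gap between the two contigs, so sequencing its insert in full would cover that gap part.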

When no shotgun library clone or cosmid clone 
covering the gap part is available, primers complementary 
to the end sequences of the two different contigs are 
prepared and the DNA fragment in the gap part is amplified. 
Then, sequencing is performed by the primer walking method 
using the amplified DNA fragment as a template or by the 
shotgun method in which the sequence of a shotgun clone 
prepared from the amplified DNA fragment is determined. 
Thus, the nucleotide sequence of the above-described region 
can be determined. 
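
The choice of a primer pair facing into a gap can be sketched as below. The sketch assumes that both contigs are given on the same strand with the gap lying between the 3' end of the first and the 5' end of the second; the primer length of 20 nucleotides, the function names and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gap_primers(left_contig: str, right_contig: str, length: int = 20):
    """Return (forward, reverse) primers that point into the gap.

    forward: the last `length` bases of the left contig.
    reverse: reverse complement of the first `length` bases of the right contig.
    """
    if len(left_contig) < length or len(right_contig) < length:
        raise ValueError("contigs must be at least as long as the primer")
    return left_contig[-length:], revcomp(right_contig[:length])


# Hypothetical short contigs, with a 6-nucleotide primer for readability.
forward, reverse = gap_primers("AAACCCGGGTTT", "GATTACAGGG", length=6)
```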

In a region showing a low sequence accuracy, 
primers are synthesized using the AUTOFINISH function and 
the NAVIGATING function of consed (The University of 
Washington), and the sequence is determined by the primer 
walking method to improve the sequence accuracy. 

Examples of the thus determined nucleotide sequence 
of the full genome include the full nucleotide sequence of 
genome of Corynebacterium glutamicum ATCC 13032 represented 
by SEQ ID NO:1. 

(7) Determination of nucleotide sequence of microorganism 
genome DNA using the nucleotide sequence represented by SEQ 
ID NO:1 

A nucleotide sequence of a polynucleotide having a 
homology of 80% or more with the full nucleotide sequence 
of Corynebacterium glutamicum ATCC 13032 represented by SEQ 
ID NO:1 as determined above can also be determined using 
the nucleotide sequence represented by SEQ ID NO:1, and the 
polynucleotide having a nucleotide sequence having a 
homology of 80% or more with the nucleotide sequence 
represented by SEQ ID NO:1 of the present invention is 
within the scope of the present invention. The term 
"polynucleotide having a nucleotide sequence having a 
homology of 80% or more with the nucleotide sequence 
represented by SEQ ID NO:1 of the present invention" is a 
polynucleotide in which a full nucleotide sequence of the 
chromosome DNA can be determined using as a primer an 
oligonucleotide composed of continuous 5 to 50 nucleotides 
in the nucleotide sequence represented by SEQ ID NO:1, for 
example, according to PCR using the chromosome DNA as a 
template. A particularly preferred primer in determination 
of the full nucleotide sequence is an oligonucleotide 
having nucleotide sequences which are positioned at the 
interval of about 300 to 500 bp, and among such 
oligonucleotides, an oligonucleotide having a nucleotide 
sequence selected from DNAs encoding a protein relating to 
a main metabolic pathway is particularly preferred. The 
polynucleotide in which the full nucleotide sequence of the 
chromosome DNA can be determined using the oligonucleotide 
includes polynucleotides constituting a chromosome DNA 
derived from a microorganism belonging to coryneform 
bacteria. Such a polynucleotide is preferably a 
polynucleotide constituting chromosome DNA derived from a 
microorganism belonging to the genus Corynebacterium, more 
preferably a polynucleotide constituting a chromosome DNA 
of Corynebacterium glutamicum. 

2. Identification of ORF (open reading frame) and 
expression regulatory fragment and determination of the 
function of ORF 

Based on the full nucleotide sequence data of the 
genome derived from coryneform bacteria determined in the 
above item 1, an ORF and an expression modulating fragment 
can be identified. Furthermore, the function of the thus 
determined ORF can be determined. 

The ORF means a continuous region in the nucleotide 
sequence of mRNA which can be translated as an amino acid 
sequence to mature to a protein. A region of the DNA 
coding for the ORF of mRNA is also called ORF. 
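
The definition above can be illustrated by a minimal scan for open reading frames. The sketch assumes ATG as the only initiation codon and TAA, TAG and TGA as termination codons, searches the three frames of each strand, and is not the method of any software named in this specification; the names find_orfs and min_codons and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Return (strand, start, end, sequence) for each ATG...stop region.

    Coordinates are 0-based, end-exclusive, on the strand that was scanned.
    min_codons counts the stop codon as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame initiation codon
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


# Hypothetical 16-nucleotide example containing one short ORF.
orfs = find_orfs("CCATGAAATTTTAGGG")
```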

The expression modulating fragment (hereinafter 
referred to as "EMF") is used herein to define a series of 
polynucleotide fragments which modulate the expression of 
the ORF or another sequence ligated operably thereto. 
The expression "modulate the expression of a sequence 
ligated operably" is used herein to refer to changes in 
the expression of a sequence due to the presence of the EMF. 
Examples of the EMF include a promoter, an operator, an 
enhancer, a silencer, a ribosome-binding sequence, a 
transcriptional termination sequence, and the like. In 
coryneform bacteria, an EMF is usually present in an 
intergenic segment (a fragment positioned between two 
genes; about 10 to 200 nucleotides in length). Accordingly, 
an EMF is frequently present in an intergenic segment of 10 
nucleotides or longer. It is also possible to determine or 
discover the presence of an EMF by using known EMF 
sequences as a target sequence or a target structural motif 
(or a target motif) using an appropriate software or 
comparator, such as FASTA (Proc. Natl. Acad. Sci. USA, 
85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410 
(1990)) or the like. Also, it can be identified and 
evaluated using a known EMF-capturing vector (for example, 
pKK232-8; manufactured by Amersham Pharmacia Biotech). 
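
Candidate intergenic segments of 10 nucleotides or longer can be listed from ORF coordinates as in the following sketch. It assumes 0-based, end-exclusive coordinates on a linear sequence and ignores the circularity of the chromosome; the function name intergenic_segments and the coordinates are hypothetical.

```python
def intergenic_segments(orfs, min_len: int = 10):
    """Return (start, end) of gaps between consecutive ORFs that are at least min_len long.

    orfs: iterable of (start, end) pairs, 0-based and end-exclusive.
    Overlapping ORFs are handled by tracking the furthest end seen so far.
    """
    segments = []
    prev_end = None
    for start, end in sorted(orfs):
        if prev_end is not None and start - prev_end >= min_len:
            segments.append((prev_end, start))
        prev_end = end if prev_end is None else max(prev_end, end)
    return segments


# Hypothetical ORF coordinates: the 5-nucleotide gap is skipped, the 50-nucleotide gap is kept.
segments = intergenic_segments([(0, 100), (105, 300), (350, 500)])
```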

The term "target sequence" is used herein to refer 
to a nucleotide sequence composed of 6 or more nucleotides, 
an amino acid sequence composed of 2 or more amino acids, 
or a nucleotide sequence encoding this amino acid sequence 
composed of 2 or more amino acids. The longer a target 
sequence is, the lower the probability that it appears at 
random in a database. The target sequence is preferably 
about 10 to 100 amino acid residues or about 30 to 300 
nucleotide residues. 
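
The relation between target length and random occurrence can be made concrete with a rough estimate. The sketch below assumes that residues are independent and equally frequent, which real sequences only approximate; the function name expected_random_hits and the database sizes are hypothetical.

```python
def expected_random_hits(target_len: int, db_len: int, alphabet: int = 4) -> float:
    """Expected number of chance occurrences of one specific target in a random database.

    alphabet is 4 for nucleotides and 20 for amino acids.
    """
    positions = max(db_len - target_len + 1, 0)  # possible start positions
    return positions * alphabet ** (-target_len)


# A 6-nucleotide target is expected about once in roughly 4 kb of random sequence,
# whereas a 30-nucleotide target is essentially never expected by chance
# in a sequence of a few million nucleotides.
six_mer = expected_random_hits(6, 4096 + 5)
thirty_mer = expected_random_hits(30, 3_300_000)
```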

The term "target structural motif" or "target 
motif" is used herein to refer to a sequence or a 
combination of sequences selected optionally and reasonably. 
Such a motif is selected on the basis of the 
three-dimensional structure formed by the folding of a 
polypeptide by means known to one of ordinary skill in the 
art. Various motifs are known. 

Examples of the target motif of a polypeptide 
include, but are not limited to, an enzyme activity site, a 
protein-protein interaction site, a signal sequence, and 
the like. Examples of the target motif of a nucleic acid 
include a promoter sequence, a transcriptional regulatory 
factor binding sequence, a hairpin structure, and the like. 

Examples of highly useful EMF include a high- 
expression promoter, an inducible-expression promoter, and 
the like. Such an EMF can be obtained by positionally 
determining the nucleotide sequence of a gene which is 
known or expected as achieving high expression (for example, 
ribosomal RNA gene: GenBank Accession No. M16175 or E46753) 
or a gene showing a desired induction pattern (for example, 
isocitrate lyase gene induced by acetic acid: Japanese 
Published Unexamined Patent Application No. 56782/93) via 
the alignment with the full genome nucleotide sequence 
determined in the above item 1, and isolating the genome 
fragment in the upstream part (usually 200 to 500 
nucleotides from the translation initiation site) . It is 
also possible to obtain a highly useful EMF by selecting an 
EMF showing a high expression efficiency or a desired 
induction pattern from among promoters captured by the EMF- 
capturing vector as described above. 
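
Taking the genome fragment in the upstream part of a located gene can be sketched as follows. The sketch treats the genome as a linear string and does not wrap around the origin; the function name upstream_region, the default length of 300 nucleotides and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome: str, start: int, strand: str = "+", length: int = 300) -> str:
    """Return up to `length` nucleotides upstream of the translation initiation site.

    start: 0-based plus-strand index of the first base of the initiation codon.
    For a gene on the minus strand this is the highest plus-strand index of the gene,
    and the region is returned 5' to 3' on the minus strand.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    if strand == "-":
        return revcomp(genome[start + 1:start + 1 + length])
    raise ValueError("strand must be '+' or '-'")


# Hypothetical examples with short sequences.
plus_up = upstream_region("AAAACCCCATGGGG", 8, "+", length=4)
minus_up = upstream_region("TTTCATGGGGAAAA", 5, "-", length=6)
```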

The ORF can be identified by extracting 
characteristics common to individual ORFs, constructing a 
general model based on these characteristics, and measuring 
the conformity of the subject sequence with the model. In 
the identification, a software, such as GeneMark (Nuc. 
Acids. Res., 22: 4756-67 (1994); manufactured by GenePro), 
GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein, 
Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (Nuc. 
Acids. Res., 26: 544-548 (1998); manufactured by The 
Institute for Genomic Research), or the like, can be used. 
In using the software, the default (initial setting) 
parameters are usually used, though the parameters can be 
optionally changed. 

In the above-described comparisons, a computer, 
such as UNIX, PC, Macintosh, or the like, can be used. 

Examples of the ORF determined by the method of the 
present invention include ORFs having the nucleotide 
sequences represented by SEQ ID NOS:2 to 3501 present in 
the genome of Corynebacterium glutamicum as represented by 
SEQ ID NO:1. In these ORFs, polypeptides having the amino 
acid sequences represented by SEQ ID NOS:3502 to 7001 are 
encoded. 

The function of an ORF can be determined by 
comparing the identified amino acid sequence of the ORF 
with known homologous sequences using a homology searching 
software or comparator, such as BLAST, FASTA, Smith & 
Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an 
amino acid data base, such as Swiss-Prot, PIR, 
GenBank-nr-aa, GenPept constituted by protein-encoding 
domains derived from GenBank data base, OWL or the like. 

Furthermore, by the homology searching, the 
identity and similarity with the amino acid sequences of 
known proteins can also be analyzed. 

With respect to the term "identity" used herein, 
where two polypeptides each having 10 amino acids are 
different in the positions of 3 amino acids, these 
polypeptides have an identity of 70% with each other. In 
the case where one of the 3 different amino acids is an 
analogue (for example, leucine and isoleucine), these 
polypeptides have a similarity of 80%. 
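
The arithmetic of this example can be reproduced with the short sketch below. It compares two sequences of equal length position by position without alignment; the grouping of analogous amino acids in SIMILAR_GROUPS is an assumption made for illustration (the text names only leucine and isoleucine), and the two peptides are hypothetical.

```python
# Assumed groups of analogous amino acids (one-letter codes); illustrative only.
SIMILAR_GROUPS = [set("LIVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(a: str, b: str):
    """Return (identity %, similarity %) for two equal-length amino acid sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(x == y for x, y in zip(a, b))
    analogous = sum(
        x != y and any(x in group and y in group for group in SIMILAR_GROUPS)
        for x, y in zip(a, b)
    )
    n = len(a)
    return 100 * identical / n, 100 * (identical + analogous) / n


# Two hypothetical 10-residue peptides differing at 3 positions,
# one of which is a leucine/isoleucine pair.
identity, similarity = identity_and_similarity("MKTLAGDSPW", "MKTIAGDAPC")
```

With these inputs the function gives an identity of 70% and a similarity of 80%, matching the example in the text.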

As a specific example, Table 1 shows the
registration numbers in known databases of sequences which
are judged as having the highest similarity with the
nucleotide sequence of the ORF derived from Corynebacterium
glutamicum ATCC 13032, genes of these sequences, functions
of these genes, and identities thereof compared with known
amino acid translation sequences.

Thus, a great number of novel genes derived from 
coryneform bacteria can be identified by determining the 
full nucleotide sequence of the genome derived from 
coryneform bacterium by the means of the present invention. 
Moreover, the function of the proteins encoded by these 
genes can be determined. Since coryneform bacteria are
industrially highly useful microorganisms, many of the
identified genes are industrially useful.

Moreover, the characteristics of respective 
microorganisms can be clarified by classifying the 
functions thus determined. As a result, valuable
information in breeding is obtained.

Furthermore, from the ORF information derived from
coryneform bacteria, the ORF corresponding to the
microorganism is prepared and obtained according to the
general method as disclosed in Molecular Cloning, 2nd ed.
or the like. Specifically, an oligonucleotide having a
nucleotide sequence adjacent to the ORF is synthesized, and
the ORF can be isolated and obtained using the
oligonucleotide as a primer and a chromosomal DNA derived
from coryneform bacteria as a template according to the
general PCR cloning technique. The ORF sequences thus obtained
include polynucleotides comprising the nucleotide sequence
represented by any one of SEQ ID NOS:2 to 3501.
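
As an illustration of choosing primers from the regions
adjacent to an ORF, the sketch below takes a sense primer from
the sequence immediately upstream of the ORF and an antisense
primer from the reverse complement of the sequence immediately
downstream. The function names, the 0-based coordinates, the
primer length, and the toy genome are assumptions made for the
example only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def flanking_primers(genome, orf_start, orf_end, length=20):
    """Pick primers adjacent to the ORF genome[orf_start:orf_end] (0-based, end-exclusive)."""
    if orf_start < length or orf_end + length > len(genome):
        raise ValueError("not enough flanking sequence for the requested primer length")
    sense = genome[orf_start - length:orf_start]
    antisense = reverse_complement(genome[orf_end:orf_end + length])
    return sense, antisense


# Toy genome (hypothetical): 12 nt upstream, a 9 nt ORF (ATG AAA TAA), 12 nt downstream.
genome = "GGATCCTTAGCA" + "ATGAAATAA" + "CGTTCGGAATTC"
sense, antisense = flanking_primers(genome, orf_start=12, orf_end=21, length=6)
```

A real primer would be longer than the 6 nucleotides used
here; the short length only keeps the toy example readable.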

The ORF or primer can be prepared using a
polynucleotide synthesizer based on the above sequence
information.

Examples of the polynucleotide of the present
invention include a polynucleotide containing the
nucleotide sequence of the ORF obtained in the above, and a
polynucleotide which hybridizes with the polynucleotide
under stringent conditions.

The polynucleotide of the present invention can be
a single-stranded DNA, a double-stranded DNA and a
single-stranded RNA, though it is not limited thereto.

The polynucleotide which hybridizes with the
polynucleotide containing the nucleotide sequence of the
ORF obtained in the above under stringent conditions
includes a degenerate mutant of the ORF. A degenerate
mutant is a polynucleotide fragment having a nucleotide
sequence which is different from the sequence of the ORF of
the present invention but which encodes the same amino acid
sequence owing to the degeneracy of the genetic code.
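
The notion of a nucleotide sequence that differs from an ORF
yet encodes the same amino acid sequence can be illustrated
with the short sketch below. It assumes the standard genetic
code; the function names and the two example sequences are
hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf):
    """Translate a DNA coding sequence (length divisible by 3) into one-letter amino acids."""
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_mutant(orf, candidate):
    """True when the nucleotide sequences differ but encode the same amino acid sequence."""
    return orf != candidate and translate(orf) == translate(candidate)


original = "ATGGCTAAAGTTTAA"  # hypothetical ORF: Met-Ala-Lys-Val-stop
variant = "ATGGCCAAGGTGTGA"   # synonymous codons at four positions
same_protein = is_degenerate_mutant(original, variant)
```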

Specific examples include a polynucleotide
comprising the nucleotide sequence represented by any one
of SEQ ID NOS:2 to 3431, and a polynucleotide which
hybridizes with the polynucleotide under stringent
conditions.

A polynucleotide which hybridizes under stringent
conditions is a polynucleotide obtained by colony
hybridization, plaque hybridization, Southern blot
hybridization or the like using, as a probe, the
polynucleotide having the nucleotide sequence of the ORF
identified in the above. Specific examples include a
polynucleotide which can be identified by carrying out
hybridization at 65°C in the presence of 0.7-1.0 M NaCl
using a filter on which a polynucleotide prepared from
colonies or plaques is immobilized, and then washing the
filter with 0.1x to 2x SSC solution (the composition of 1x
SSC contains 150 mM sodium chloride and 15 mM sodium
citrate) at 65°C.
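
For convenience, the salt concentrations at the two ends of
the washing range can be computed from the 1x SSC composition
given above. The sketch below is simple scaling arithmetic; the
function and variable names are hypothetical.

```python
# Composition of 1x SSC as stated in the text (millimolar).
SSC_1X_MM = {"sodium chloride": 150.0, "sodium citrate": 15.0}


def ssc_composition(strength):
    """Return the mM concentration of each component of an SSC solution of the given strength."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    return {name: round(conc * strength, 3) for name, conc in SSC_1X_MM.items()}


low = ssc_composition(0.1)   # lowest salt (most stringent) wash in the stated range
high = ssc_composition(2.0)  # highest salt (least stringent) wash in the stated range
```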

The hybridization can be carried out in accordance
with known methods described in, for example, Molecular
Cloning, 2nd ed., Current Protocols in Molecular Biology,
DNA Cloning 1: Core Techniques, A Practical Approach,
Second Edition, Oxford University (1995) or the like.
Specific examples of the polynucleotide which can be
hybridized include a DNA having a homology of 60% or more,
preferably 80% or more, and particularly preferably 95% or
more, with the nucleotide sequence represented by any one
of SEQ ID NOS:2 to 3431 when calculated using default
(initial setting) parameters of homology searching
software, such as BLAST, FASTA, Smith-Waterman or the like.

Also, the polynucleotide of the present invention
includes a polynucleotide encoding a polypeptide comprising
the amino acid sequence represented by any one of SEQ ID 
NOS:3502 to 6931 and a polynucleotide which hybridizes with 
the polynucleotide under stringent conditions. 

Furthermore, the polynucleotide of the present
invention includes a polynucleotide which is present in the
5' upstream or 3' downstream region of a polynucleotide
comprising the nucleotide sequence of any one of SEQ ID
NOS:2 to 3431 in a polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of
regulating an expression of a polypeptide encoded by the
polynucleotide. Specific examples of the polynucleotide
having an activity of regulating an expression of a
polypeptide encoded by the polynucleotide include a
polynucleotide encoding the above-described EMF, such as a
promoter, an operator, an enhancer, a silencer, a
ribosome-binding sequence, a transcriptional termination
sequence, and the like.

The primer used for obtaining the ORF according to
the above PCR cloning technique includes an oligonucleotide
comprising a sequence which is the same as a sequence of 10
to 200 continuous nucleotides in the nucleotide sequence of
the ORF and an adjacent region or an oligonucleotide
comprising a sequence which is complementary to the
oligonucleotide. Specific examples include an
oligonucleotide comprising a sequence which is the same as
a sequence of 10 to 200 continuous nucleotides of the
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431, and an oligonucleotide comprising a sequence
complementary to the oligonucleotide comprising a sequence
of at least 10 to 20 continuous nucleotides of any one of
SEQ ID NOS:1 to 3431. When the primers are used as a sense
primer and an antisense primer, the above-described
oligonucleotides in which melting temperature (Tm) and the
number of nucleotides are not significantly different from
each other are preferred.
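
One rough way to check that a sense primer and an antisense
primer are matched in Tm and length is sketched below. The
present text does not specify how Tm is estimated; the sketch
assumes the Wallace rule (2°C per A or T, 4°C per G or C), a
common approximation for short oligonucleotides, and the
tolerance values and primer sequences are hypothetical.

```python
def wallace_tm(primer):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    if at + gc != len(primer):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


def is_matched_pair(sense, antisense, max_tm_diff=5, max_length_diff=3):
    """Return True when Tm and length of the two primers do not differ much.

    The tolerances are illustrative assumptions, not values from the text.
    """
    tm_diff = abs(wallace_tm(sense) - wallace_tm(antisense))
    length_diff = abs(len(sense) - len(antisense))
    return tm_diff <= max_tm_diff and length_diff <= max_length_diff


sense = "ATGCATGCATGCATGCATGC"      # hypothetical 20-mer, Tm estimate 60
antisense = "GCGCATATGCGCATATGCGC"  # hypothetical 20-mer, Tm estimate 64
matched = is_matched_pair(sense, antisense)
```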

The oligonucleotide of the present invention
includes an oligonucleotide comprising a sequence which is
the same as 10 to 200 continuous nucleotides of the
nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431 or an oligonucleotide comprising a sequence
complementary to the oligonucleotide.

Also, analogues of these oligonucleotides
(hereinafter also referred to as "analogous
oligonucleotides") are also provided by the present
invention and are useful in the methods described herein.

Examples of the analogous oligonucleotides include
analogous oligonucleotides in which a phosphodiester bond
in an oligonucleotide is converted to a phosphorothioate
bond, analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to an N3'-P5'
phosphoramidate bond, analogous oligonucleotides in which
ribose and a phosphodiester bond in an oligonucleotide are
converted to a peptide nucleic acid bond, analogous
oligonucleotides in which uracil in an oligonucleotide is
replaced with C-5 propynyl uracil, analogous
oligonucleotides in which uracil in an oligonucleotide is
replaced with C-5 thiazoluracil, analogous oligonucleotides
in which cytosine in an oligonucleotide is replaced with
C-5 propynyl cytosine, analogous oligonucleotides in which
cytosine in an oligonucleotide is replaced with
phenoxazine-modified cytosine, analogous oligonucleotides
in which ribose in an oligonucleotide is replaced with
2'-O-propylribose, analogous oligonucleotides in which
ribose in an oligonucleotide is replaced with
2'-methoxyethoxyribose, and the like (Cell Engineering,
16: 1463 (1997)).

The above oligonucleotides and analogous
oligonucleotides of the present invention can be used not
only as primers but also as probes for hybridization and as
antisense nucleic acids described below.

Examples of a primer for the antisense nucleic acid
techniques known in the art include an oligonucleotide
which hybridizes with the oligonucleotide of the present
invention under stringent conditions and has an activity
of regulating expression of the polypeptide encoded by the
polynucleotide, in addition to the above oligonucleotide.

3. Determination of isozymes 

Many mutants of coryneform bacteria which are 
useful in the production of useful substances, such as 
amino acids, nucleic acids, vitamins, saccharides, organic
acids, and the like, are obtained by the present invention. 

However, since the gene sequence data of the 
microorganism has been, to date, insufficient, useful 
mutants have been obtained by mutagenic techniques using a 
mutagen, such as nitrosoguanidine (NTG) or the like.

Although genes can be mutated randomly by the
mutagenic method using the above-described mutagen, it is
not possible to mutate all of the genes encoding the
respective isozymes having similar properties relating to
the metabolism of intermediates. In the mutagenic method
using a mutagen, genes are mutated randomly. Accordingly,
harmful mutations worsening culture characteristics, such as
delay in growth, accelerated foaming, and the like, might be
imparted at a great frequency, in a random manner.

However, if gene sequence information is available, 
such as is provided by the present invention, it is 
possible to mutate all of the genes encoding target 
isozymes. In this case, harmful mutations may be avoided
and the target mutation can be incorporated. 

Namely, an accurate number and sequence information
of the target isozymes in coryneform bacteria can be
obtained based on the ORF data obtained in the above item 2.
By using the sequence information, all of the target
isozyme genes can be mutated into genes having the desired
properties by, for example, the site-specific mutagenesis
method described in Molecular Cloning, 2nd ed., to obtain
useful mutants having elevated productivity of useful
substances.

4. Clarification or determination of biosynthesis pathway
and signal transmission pathway

Attempts have been made to elucidate biosynthesis 
pathways and signal transmission pathways in a number of 
organisms, and many findings have been reported. However,
there are many unknown aspects of coryneform bacteria since 
a number of genes have not been identified so far. 

These unknown points can be clarified by the 
following method. 

The functional information of ORF derived from 
coryneform bacteria as identified by the method of above 
item 2 is arranged. The term "arranged" means that the ORF 
is classified based on the biosynthesis pathway of a 
substance or the signal transmission pathway to which the 
ORF belongs using known information according to the 
functional information. Next, the arranged ORF sequence 
information is compared with enzymes on the biosynthesis 
pathways or signal transmission pathways of other known 
organisms. The resulting information is combined with 
known data on coryneform bacteria. Thus, the biosynthesis
pathways and signal transmission pathways in coryneform
bacteria, which have been unknown so far, can be determined.
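
The "arranging" step and the comparison with another organism
can be pictured with the small sketch below. It is only a
data-handling illustration: the ORF identifiers are
placeholders (not SEQ ID NOs), the function names are
hypothetical, and the annotation tables are tiny illustrative
subsets rather than actual results.

```python
from collections import defaultdict


def arrange_orfs(orf_functions, function_to_pathway):
    """Group ORF identifiers by the pathway to which their annotated function belongs."""
    arranged = defaultdict(list)
    for orf_id, function in orf_functions.items():
        arranged[function_to_pathway.get(function, "unassigned")].append(orf_id)
    return dict(arranged)


def missing_enzymes(reference_pathway, orf_functions):
    """List enzymes of a reference organism's pathway with no matching ORF annotation."""
    annotated = set(orf_functions.values())
    return [enzyme for enzyme in reference_pathway if enzyme not in annotated]


# Hypothetical ORF annotations obtained by homology searching.
orf_functions = {
    "orf_a": "aspartokinase",
    "orf_b": "homoserine dehydrogenase",
    "orf_c": "glucose-6-phosphate dehydrogenase",
    "orf_d": "hypothetical protein",
}
# Known information used for the classification (illustrative subset).
function_to_pathway = {
    "aspartokinase": "lysine biosynthesis",
    "homoserine dehydrogenase": "threonine biosynthesis",
    "glucose-6-phosphate dehydrogenase": "pentose phosphate pathway",
}
arranged = arrange_orfs(orf_functions, function_to_pathway)

# Enzymes of the same pathway in another organism (illustrative subset).
reference_lysine_pathway = ["aspartokinase", "dihydrodipicolinate synthase"]
gaps = missing_enzymes(reference_lysine_pathway, orf_functions)
```

Entries returned in `gaps` mark steps for which no ORF has yet
been assigned and which would therefore merit closer
examination.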

Once these pathways, which have hitherto been
unknown or unclear, are clarified, a useful mutant
for producing a target useful substance can be efficiently
obtained.

When the thus clarified pathway is judged as
important in the synthesis of a useful product, a useful
mutant can be obtained by selecting a mutant wherein this
pathway has been strengthened. Also, when the thus
clarified pathway is judged as not important in the
biosynthesis of the target useful product, a useful mutant
can be obtained by selecting a mutant wherein the
utilization frequency of this pathway is lowered.

5. Clarification or determination of useful mutation point 

Many useful mutants of coryneform bacteria which 
are suitable for the production of useful substances, such 
as amino acids, nucleic acids, vitamins, saccharides, 
organic acids, and the like, have been obtained. However, 
it is hardly known which mutation point is imparted to a 
gene to improve the productivity. 

However, mutation points contained in production 
strains can be identified by comparing desired sequences of 
the genome DNA of the production strains obtained from 
coryneform bacteria by the mutagenic technique with the 
nucleotide sequences of the corresponding genome DNA and 
ORF derived from coryneform bacteria determined by the 
methods of the above items 1 and 2 and analyzing them.

Moreover, effective mutation points contributing to 
the production can be easily specified from among these 
mutation points on the basis of known information relating
to the metabolic pathways, the metabolic regulatory
mechanisms, the structure-activity correlation of enzymes,
and the like.

When an effective mutation can hardly be specified
based on known data, the mutation points thus identified
can be introduced into a wild strain of coryneform bacteria
or a production strain free of the mutation. Then, it is
examined whether or not any positive effect can be achieved
on the production.

For example, by comparing the nucleotide sequence
of homoserine dehydrogenase gene hom of a lysine-producing
B-6 strain of Corynebacterium glutamicum (Appl. Microbiol.
Biotechnol., 32: 269-273 (1989)) with the nucleotide
sequence corresponding to the genome of Corynebacterium
glutamicum ATCC 13032 according to the present invention, a
mutation of amino acid replacement in which valine at the
59-position is replaced with alanine (Val59Ala) was
identified. A strain obtained by introducing this mutation
into the ATCC 13032 strain by the gene replacement method
can produce lysine, which indicates that this mutation is
an effective mutation contributing to the production of
lysine.
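
The comparison that yields a mutation point such as Val59Ala
can be sketched as a codon-by-codon comparison of two aligned
ORFs. The sketch assumes the standard genetic code and ORFs of
equal length without insertions or deletions; the function
name and the short toy sequences are hypothetical and are not
the hom sequences themselves.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln", "E": "Glu",
    "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe",
    "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val", "*": "Ter",
}


def amino_acid_replacements(wild_type_orf, mutant_orf):
    """Compare two aligned ORFs codon by codon and name each amino acid replacement."""
    if len(wild_type_orf) != len(mutant_orf) or len(wild_type_orf) % 3:
        raise ValueError("ORFs must have equal length divisible by 3")
    replacements = []
    for start in range(0, len(wild_type_orf), 3):
        wild_aa = CODON_TABLE[wild_type_orf[start:start + 3]]
        mutant_aa = CODON_TABLE[mutant_orf[start:start + 3]]
        if wild_aa != mutant_aa:
            position = start // 3 + 1  # 1-based residue number
            replacements.append(f"{THREE_LETTER[wild_aa]}{position}{THREE_LETTER[mutant_aa]}")
    return replacements


wild_type = "ATGAAAGTTCCG"  # hypothetical: Met-Lys-Val-Pro
mutant = "ATGAAAGCTCCA"     # GTT->GCT changes Val to Ala; CCG->CCA is silent
points = amino_acid_replacements(wild_type, mutant)
```

Silent base changes, such as the last codon in the toy mutant,
are not reported because they do not alter the encoded amino
acid.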

Similarly, by comparing the nucleotide sequence of
pyruvate carboxylase gene pyc of the B-6 strain with the
nucleotide sequence corresponding to the ATCC 13032 genome,
a mutation of amino acid replacement in which proline at
the 458-position was replaced with serine (Pro458Ser) was
identified. A strain obtained by introducing this mutation
into a lysine-producing strain No. 58 (FERM BP-7134) of
Corynebacterium glutamicum free of this mutation shows an
improved lysine productivity in comparison with the No. 58
strain, which indicates that this mutation is an effective
mutation contributing to the production of lysine.

In addition, a mutation Ala213Thr in
glucose-6-phosphate dehydrogenase was specified as an
effective mutation relating to the production of lysine by
detecting glucose-6-phosphate dehydrogenase gene zwf of the
B-6 strain.

Furthermore, the lysine productivity of
Corynebacterium glutamicum was improved by replacing the
base at the 932-position of aspartokinase gene lysC of the
Corynebacterium glutamicum ATCC 13032 genome with cytosine
to thereby replace threonine at the 311-position by
isoleucine, which indicates that this mutation is an
effective mutation contributing to the production of lysine.

Also, as another method to examine whether or not
the identified mutation point is an effective mutation,
there is a method in which the mutation possessed by the
lysine-producing strain is returned to the sequence of a
wild type strain by the gene replacement method and it is
examined whether or not this has a negative influence on
the lysine productivity. For example, when the amino acid
replacement mutation Val59Ala possessed by hom of the
lysine-producing B-6 strain was returned to a wild type
amino acid sequence, the lysine productivity was lowered in
comparison with the B-6 strain. Thus, it was found that
this mutation is an effective mutation contributing to the
production of lysine.

Effective mutation points can be more efficiently
and comprehensively extracted by combining, if needed, the
DNA array analysis or proteome analysis described below.

6. Method of breeding industrially advantageous production 
strain 

It has been a general practice to construct
production strains, which are used industrially in the
fermentation production of the target useful substances,
such as amino acids, nucleic acids, vitamins, saccharides,
organic acids, and the like, by repeating mutagenesis and
breeding based on random mutagenesis using mutagens, such
as NTG or the like, and screening.

In recent years, many examples of improved
production strains have been made through the use of
recombinant DNA techniques. In breeding, however, most of
the parent production strains to be improved are mutants
obtained by a conventional mutagenic procedure (W.
Leuchtenberger, Amino Acids - Technical Production and Use.
In: Roehr (ed) Biotechnology, second edition, vol. 6,
Products of Primary Metabolism. VCH Verlagsgesellschaft mbH,
Weinheim, p. 465 (1996)).

Although mutagenesis methods have largely
contributed to the progress of the fermentation industry,
they suffer from a serious problem of multiple, random
introduction of mutations into every part of the chromosome.
Since many mutations are accumulated in a single chromosome
each time a strain is improved, a production strain
obtained by random mutation and selection is generally
inferior in properties (for example, showing poor growth,
delayed consumption of saccharides, and poor resistance to
stresses such as temperature and oxygen) to a wild type
strain. This brings about troubles such as failing to
establish a sufficiently elevated productivity, being
frequently contaminated with miscellaneous bacteria,
requiring troublesome procedures in culture maintenance,
and the like, which in turn elevate the production
cost in practice. In addition, the improvement in the
productivity is based on random mutations and thus the
mechanism thereof is unclear. Therefore, it is very
difficult to plan a rational breeding strategy for the
subsequent improvement in the productivity.

According to the present invention, effective 
mutation points contributing to the production can be 
efficiently specified from among many mutation points 
accumulated in the chromosome of a production strain which
has been bred from coryneform bacteria and, therefore, a
novel breeding method of assembling these effective
mutations in the coryneform bacteria can be established.
Thus, a useful production strain can be reconstructed. It 
is also possible to construct a useful production strain 
from a wild type strain. 

Specifically, a useful mutant can be constructed in 
the following manner. 

One of the mutation points is incorporated into a 
wild type strain of coryneform bacteria. Then, it is 
examined whether or not a positive effect is established on 
the production. When a positive effect is obtained, the 
mutation point is saved. When no effect is obtained, the 
mutation point is removed. Subsequently, only a strain 
having the effective mutation point is used as the parent 
strain, and the same procedure is repeated. In general,
the effectiveness of a mutation positioned upstream cannot
be clearly evaluated in some cases when there is a
rate-determining point downstream in a biosynthesis
pathway. It is therefore preferred to successively
evaluate mutation points upward from downstream.
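
The stepwise procedure above can be sketched as a simple loop
in which each mutation point is kept only when it raises the
measured productivity of the current parent strain. The
mutation names are borrowed from the examples in item 5, but
the scores, the order of evaluation, and the interaction
between points are invented for illustration;
`measure_productivity` stands in for an actual fermentation
test.

```python
def reconstruct_strain(mutation_points, measure_productivity):
    """Accumulate effective mutation points one at a time, starting from the wild type.

    mutation_points gives the order of evaluation; measure_productivity is a
    caller-supplied function standing in for a fermentation test.
    """
    parent = frozenset()                      # wild type: no mutation points
    best = measure_productivity(parent)
    for point in mutation_points:
        candidate = parent | {point}
        productivity = measure_productivity(candidate)
        if productivity > best:               # positive effect: save the point
            parent, best = candidate, productivity
        # otherwise the point is removed and the parent strain is unchanged
    return parent, best


def toy_productivity(strain):
    """Invented scores; the pyc effect is masked unless lysC is already present."""
    score = 0
    if "lysC_Thr311Ile" in strain:
        score += 10
    if "hom_Val59Ala" in strain:
        score += 5
    if "pyc_Pro458Ser" in strain and "lysC_Thr311Ile" in strain:
        score += 3
    if "unrelated_point" in strain:
        score -= 2
    return score


downstream_first = ["lysC_Thr311Ile", "hom_Val59Ala", "unrelated_point", "pyc_Pro458Ser"]
strain, productivity = reconstruct_strain(downstream_first, toy_productivity)
upstream_first = list(reversed(downstream_first))
strain_alt, productivity_alt = reconstruct_strain(upstream_first, toy_productivity)
```

In this toy model, evaluating the masked point first causes it
to be discarded, which mirrors the reason given above for
evaluating mutation points upward from downstream.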

By reconstituting effective mutations by the method 
as described above in a wild type strain or a strain which 
has a high growth speed or the same ability to consume 
saccharides as the wild type strain, it is possible to 
construct an industrially advantageous strain which is free
of troubles in the previous methods as described above and
to conduct fermentation production using such strains
within a short time or at a higher temperature.

For example, a lysine-producing mutant B-6 (Appl.
Microbiol. Biotechnol., 32: 262-273 (1989)), which is
obtained by multiple rounds of random mutagenesis from a
wild type strain Corynebacterium glutamicum ATCC 13032,
enables lysine fermentation to be performed at a
temperature between 30 and 34°C but shows lowered growth
and lysine productivity at a temperature exceeding 34°C.
Therefore, the fermentation temperature should be
maintained at 34°C or lower. In contrast thereto, the
production strain described in the above item 5, which is
obtained by reconstituting effective mutations relating to
lysine production, can achieve a productivity at 40 to 42°C
equal or superior to the result obtained by culturing at 30
to 34°C. Therefore, this strain is industrially
advantageous since it can save the load of cooling during
the fermentation.

When culture should be carried out at a high
temperature exceeding 43°C, a production strain capable of
conducting fermentation production at a high temperature
exceeding 43°C can be obtained by reconstituting useful
mutations in a microorganism belonging to the genus
Corynebacterium which can grow at high temperature
exceeding 43°C. Examples of the microorganism capable of
growing at a high temperature exceeding 43°C include
Corynebacterium thermoaminogenes, such as Corynebacterium
thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and
FERM 9247.

A strain having a further improved productivity of
the target product can be obtained using the thus
reconstructed strain as the parent strain and further
breeding it using the conventional mutagenesis method, the
gene amplification method, the gene replacement method
using the recombinant DNA technique, the transduction
method or the cell fusion method. Accordingly, the
microorganism of the present invention includes, but is not
limited to, a mutant, a cell fusion strain, a transformant,
a transductant or a recombinant strain constructed by using
recombinant DNA techniques, so long as it is a producing
strain obtained via the step of accumulating at least two
effective mutations in a coryneform bacterium in the course
of breeding.

When a mutation point judged as being harmful to
the growth or production is specified, on the other hand,
it is examined whether or not the producing strain used at
present contains the mutation point. When it has the
mutation, it can be returned to the wild type gene and thus
a further useful production strain can be bred.

The breeding method as described above is
applicable to microorganisms, other than coryneform
bacteria, which have industrially advantageous properties
(for example, microorganisms capable of quickly utilizing
less expensive carbon sources, microorganisms capable of
growing at higher temperatures).

7. Production and utilization of polynucleotide array

(1) Production of polynucleotide array

A polynucleotide array can be produced using the 
polynucleotide or oligonucleotide of the present invention 
obtained in the above items 1 and 2.

Examples include a polynucleotide array comprising 
a solid support to which at least one of a polynucleotide 
comprising the nucleotide sequence represented by SEQ ID 
NOS:2 to 3501, a polynucleotide which hybridizes with the 
polynucleotide under stringent conditions, and a 
polynucleotide comprising 10 to 200 continuous nucleotides 
in the nucleotide sequence of the polynucleotide is 
adhered; and a polynucleotide array comprising a solid 
support to which at least one of a polynucleotide encoding 
a polypeptide comprising the amino acid sequence 
represented by any one of SEQ ID NOS:3502 to 7001, a 
polynucleotide which hybridizes with the polynucleotide 
under stringent conditions, and a polynucleotide comprising 
10 to 200 continuous bases in the nucleotide sequences of 
the polynucleotides is adhered.

Polynucleotide arrays of the present invention
include substrates known in the art, such as a DNA chip, a
DNA microarray and a DNA macroarray, and the like, and
comprise a solid support and plural polynucleotides or
fragments thereof which are adhered to the surface of the
solid support.

Examples of the solid support include a glass plate, 
a nylon membrane, and the like. 

The polynucleotides or fragments thereof adhered to 
the surface of the solid support can be adhered to the 
surface of the solid support using the general technique 
for preparing arrays. Namely, they can be
adhered to a chemically surface-treated solid support, for
example, one to which a polycation such as polylysine or the
like has been adhered (Nat. Genet., 21: 15-19 (1999)).
Chemically surface-treated supports are commercially
available, and the commercially available product can
be used as the solid support of the polynucleotide array
according to the present invention.

As the polynucleotides or oligonucleotides adhered 
to the solid support, the polynucleotides and 
oligonucleotides of the present invention obtained in the 
above items 1 and 2 can be used. 

The analysis described below can be efficiently 
performed by adhering the polynucleotides or
oligonucleotides to the solid support at a high density,
though a high fixation density is not always necessary.

Apparatus for achieving a high fixation density, 
such as an arrayer robot or the like, is commercially 
available from Takara Shuzo (GMS417 Arrayer), and the
commercially available product can be used. 

Also, the oligonucleotides of the present invention
can be synthesized directly on the solid support by the
photolithography method or the like (Nat. Genet., 21: 20-24
(1999)). In this method, a linker having a protective
group which can be removed by light irradiation is first
adhered to a solid support, such as a slide glass or the
like. Then, it is irradiated with light through a mask (a
photolithograph mask) which transmits light exclusively at a
definite part of the adhesion part. Next, an
oligonucleotide having a protective group which can be
removed by light irradiation is added to the part. Thus, a
ligation reaction with the nucleotide arises exclusively at
the irradiated part. By repeating this procedure,
oligonucleotides, each having a desired sequence, different
from each other can be synthesized in respective parts.
Usually, the oligonucleotides to be synthesized have a
length of 10 to 30 nucleotides.
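
The repeated irradiate-and-couple cycle can be pictured with
the toy simulation below, which adds one base per cycle and
derives, for each cycle, the set of spots that the mask must
expose. This is a simplified model for illustration, not the
procedure of the cited reference; the function names, spot
labels, and target sequences are hypothetical.

```python
def plan_masks(targets):
    """Return a list of (base, mask) steps; mask is the set of spots irradiated before adding base."""
    steps = []
    longest = max(len(sequence) for sequence in targets.values())
    for position in range(longest):
        for base in "ACGT":
            mask = {spot for spot, sequence in targets.items()
                    if position < len(sequence) and sequence[position] == base}
            if mask:
                steps.append((base, mask))
    return steps


def synthesize(spots, steps):
    """Simulate the cycles: only irradiated (deprotected) spots are extended by the added base."""
    grown = {spot: "" for spot in spots}
    for base, mask in steps:
        for spot in mask:
            grown[spot] += base
    return grown


# Hypothetical target oligonucleotides for three spots on the support.
targets = {"spot_1": "ACGT", "spot_2": "AGGT", "spot_3": "TTAC"}
steps = plan_masks(targets)
result = synthesize(targets, steps)
```

At most four cycles are needed per nucleotide position, so the
number of masks grows with the oligonucleotide length rather
than with the number of spots.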

(2) Use of polynucleotide array

The following procedures (a) and (b) can be carried
out using the polynucleotide array prepared in the above
(1).

(a) Identification of mutation point of coryneform 
bacterium mutant and analysis of expression amount and 
expression profile of gene encoded by genome 

By subjecting a gene derived from a mutant of 
coryneform bacteria or an examined gene to the following 
steps (i) to (iv), the mutation point of the gene can be
identified or the expression amount and expression profile 
of the gene can be analyzed: 

(i) producing a polynucleotide array by the method of
the above (1);

(ii) incubating polynucleotides immobilized on the 
polynucleotide array together with the labeled gene derived 
from a mutant of the coryneform bacterium using the 
polynucleotide array produced in the above (i) under 
hybridization conditions; 

(iii) detecting the hybridization; and 

(iv) analyzing the hybridization data. 

The gene derived from a mutant of coryneform
bacteria or the examined gene includes a gene relating to
biosynthesis of at least one selected from amino acids,
nucleic acids, vitamins, saccharides, organic acids, and
analogues thereof.

The method will be described in detail.

A single nucleotide polymorphism (SNP) in a human
region of 2,300 kb has been identified using polynucleotide
arrays (Science, 280: 1077-82 (1998)). In accordance with
the method of identifying SNP and methods described in
Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999), and the
like, using the polynucleotide array produced in the above
(1) and a nucleic acid molecule (DNA, RNA) derived from
coryneform bacteria in the method of the hybridization, a
mutation point of a useful mutant, which is useful in
producing an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid, or the like can be identified
and the gene expression amount and the expression profile
thereof can be analyzed.

The nucleic acid molecule (DNA, RNA) derived from
the coryneform bacteria can be obtained according to the
general method described in Molecular Cloning, 2nd ed. or
the like. mRNA derived from Corynebacterium glutamicum can
also be obtained by the method of Bormann et al. (Molecular
Microbiology, 6: 317-326 (1992)) or the like.

Although ribosomal RNA (rRNA) is usually obtained 
in large excess in addition to the target mRNA, the 
analysis is not seriously disturbed thereby.

The resulting nucleic acid molecule derived from
coryneform bacteria is labeled. Labeling can be carried
out according to a method using a fluorescent dye, a method
using a radioisotope or the like.

Specific examples include a labeling method in
which psoralen-biotin is crosslinked with RNA extracted
from a microorganism and, after hybridization reaction, a
fluorescent dye having streptavidin bound thereto is bound
to the biotin moiety (Nat. Biotechnol., 16: 45-48 (1998));
a labeling method in which a reverse transcription reaction
is carried out using RNA extracted from a microorganism as
a template and random primers as primers, and dUTP having a
fluorescent dye (for example, Cy3, Cy5) (manufactured by
Amersham Pharmacia Biotech) is incorporated into cDNA (Proc.
Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like.

The labeling specificity can be improved by
replacing the random primers by sequences complementary to
the 3'-end of the ORF (J. Bacteriol., 181: 6425-40 (1999)).

In the hybridization method, the hybridization and
subsequent washing can be carried out by the general method
(Nat. Biotechnol., 14: 1675-80 (1996), or the like).

Subsequently, the hybridization intensity is 
measured depending on the hybridization amount of the 
nucleic acid molecule used in the labeling. Thus, the 
mutation point can be identified and the expression amount 
of the gene can be calculated.

The hybridization intensity can be measured by
visualizing the fluorescent signal, radioactivity,
luminescence dose, and the like, using a laser confocal
microscope, a CCD camera, a radiation imaging device (for
example, STORM manufactured by Amersham Pharmacia Biotech),
and the like, and then quantifying the thus visualized data.

A polynucleotide array on a solid support can also 
be analyzed and quantified using a commercially available 
apparatus, such as GMS418 Array Scanner (manufactured by 
Takara Shuzo) or the like. 

The gene expression amount can be analyzed using 
commercially available software (for example, ImaGene 
manufactured by Takara Shuzo; Array Gauge manufactured by 
Fuji Photo Film; ImageQuant manufactured by Amersham 
Pharmacia Biotech, or the like). 

A fluctuation in the expression amount of a 
specific gene can be monitored using a nucleic acid 
molecule obtained in the time course of culture as the 
nucleic acid molecule derived from coryneform bacteria. 
The culture conditions can be optimized by analyzing the 
fluctuation. 

The expression profile of the microorganism at the 
total gene level (namely, which genes among a great number 
of genes encoded by the genome have been expressed and the 
expression ratio thereof) can be determined using a nucleic 
acid molecule having the sequences of many genes determined 
from the full genome sequence of the microorganism. Thus, 
the expression amount of the genes determined by the full 
genome sequence can be analyzed and, in its turn, the 
biological conditions of the microorganism can be 
recognized as the expression pattern at the full gene level. 
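
A minimal sketch of how such an expression profile could be tabulated
is given below. It assumes that one quantified intensity is available
per ORF and that a detection threshold separates expressed from
non-expressed genes; the threshold, the ORF names and the intensities
are hypothetical.

```python
def expression_profile(intensities, detection_threshold):
    """Return {gene: share of total signal} for genes called expressed."""
    # A gene is called "expressed" when its intensity reaches the threshold.
    expressed = {gene: value for gene, value in intensities.items()
                 if value >= detection_threshold}
    total = sum(expressed.values())
    if total == 0:
        return {}
    # The expression ratio is each gene's share of the expressed signal.
    return {gene: value / total for gene, value in expressed.items()}

# Hypothetical intensities for three ORFs
profile = expression_profile(
    {"orf_0001": 300.0, "orf_0002": 100.0, "orf_0003": 5.0},
    detection_threshold=50.0,
)
```

Here `profile` lists only the ORFs judged to be expressed, each with its
ratio relative to the total expressed signal.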

(b) Confirmation of the presence of gene homologous to 
examined gene in coryneform bacteria 

Whether or not a gene homologous to the examined 
gene, which is present in an organism other than coryneform 
bacteria, is present in coryneform bacteria can be detected 
using the polynucleotide array prepared in the above (1). 

This detection can be carried out by a method in 
which an examined gene which is present in an organism 
other than coryneform bacteria is used instead of the 
nucleic acid molecule derived from coryneform bacteria used 
in the above identification/analysis method of (1). 

8. Recording medium storing full genome nucleotide sequence 
and ORF data and being readable by a computer and methods 
for using the same 

The term "recording medium or storage device which 
is readable by a computer" means a recording medium or 
storage medium which can be directly read out and accessed 
with a computer. Examples include magnetic recording media, 
such as a floppy disk, a hard disk, a magnetic tape, and 
the like; optical recording media, such as CD-ROM, CD-R, 
CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric 
recording media, such as RAM, ROM, and the like; and 
hybrids in these categories (for example, magnetic/optical 
recording media, such as MO and the like). 

Instruments for recording or inputting in or on the 
recording medium or instruments or devices for reading out 
the information in the recording medium can be 
appropriately selected, depending on the type of the 
recording medium and the access device utilized. Also, 
various data processing programs, software, comparators and 
formats are used for recording and utilizing the 
polynucleotide sequence information or the like of the 
present invention in the recording medium. The information 
can be expressed in the form of a binary file, a text file 
or an ASCII file formatted with commercially available 
software, for example. Moreover, software for accessing 
the sequence information is available and known to one of 
ordinary skill in the art. 
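
By way of a hedged example, sequence information stored as a text file
could be read as sketched below. The sketch assumes a FASTA-style
layout (a header line beginning with ">" followed by sequence lines);
this layout, the function name and the record names are assumptions
for illustration and do not describe the actual format of the recorded
file.

```python
import io

def read_fasta(handle):
    """Parse FASTA-style text into a {record name: sequence} dictionary."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                records[name] = "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records

# Hypothetical two-record text standing in for a file on the medium
text = ">orf1 hypothetical\natgc\nGGTA\n>orf2\nttaa\n"
records = read_fasta(io.StringIO(text))
```

The same function accepts an open text file in place of the in-memory
`io.StringIO` object.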

Examples of the information to be recorded in the 
above-described medium include the full genome nucleotide 
sequence information of coryneform bacteria as obtained in 
the above item 2, the nucleotide sequence information of 
ORF, the amino acid sequence information encoded by the ORF, 
and the functional information of polynucleotides coding 
for the amino acid sequences. 

The recording medium or storage device which is 
readable by a computer according to the present invention 
refers to a medium in which the information of the present 
invention has been recorded. Examples include recording 
media or storage devices which are readable by a computer 
storing the nucleotide sequence information represented by 
SEQ ID NOS:1 to 3501, the amino acid sequence information 
represented by SEQ ID NOS:3502 to 7001, the functional 
information of the nucleotide sequences represented by SEQ 
ID NOS:1 to 3501, the functional information of the amino 
acid sequences represented by SEQ ID NOS:3502 to 7001, and 
the information listed in Table 1 below and the like. 

9. System based on a computer using the recording medium of 
the present invention which is readable by a computer 

The term "system based on a computer" as used 
herein refers to a system composed of hardware device(s), 
software device(s), and data recording device(s) which are 
used for analyzing the data recorded in the recording 
medium of the present invention which is readable by a 
computer. 

The hardware device(s) are, for example, composed 
of an input unit, a data recording unit, a central 
processing unit and an output unit collectively or 
individually. 



By the software device(s), the data recorded in the 
recording medium of the present invention are searched or 
analyzed using the recorded data and the hardware device(s) 
as described herein. Specifically, the software device(s) 
contain at least one program which acts on or with the 
system in order to screen, analyze or compare biologically 
meaningful structures or information from the nucleotide 
sequences, amino acid sequences and the like recorded in 
the recording medium according to the present invention. 

Examples of the software device(s) for identifying 
ORF and EMF domains include GeneMark (Nuc. Acids. Res., 
22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and 
Enzyme, 42: 3001-07 (1997)), Glimmer (The Institute for 
Genomic Research; Nuc. Acids. Res., 26: 544-548 (1998)) and 
the like. In the process of using such a software device, 
the default (initial setting) parameters are usually used, 
although the parameters can be changed, if necessary, in a 
manner known to one of ordinary skill in the art. 
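
The programs named above rely on statistical models of coding
sequence. Purely to illustrate what identifying an ORF means, the
following naive sketch scans the three forward reading frames for a
stretch running from an ATG codon to the first in-frame stop codon. It
is a simplified stand-in and not the algorithm of GeneMark, GeneHacker
or Glimmer; the minimum length and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(sequence, min_codons=3):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon; `end` is
    exclusive and `min_codons` counts the stop codon as well.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return orfs

# Hypothetical sequence containing one short ORF
demo = "CCATGAAATTTTAGGG"
orfs = find_orfs(demo)
```

A practical analysis would also scan the complementary strand and
apply the default parameters of the chosen program.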

Examples of the software device(s) for identifying 
a genome domain or a polypeptide domain analogous to the 
target sequence or the target structural motif (homology 
searching) include FASTA, BLAST, Smith-Waterman, GenetyxMac 
(manufactured by Software Development), GCG Package 
(manufactured by Genetics Computer Group), GenCore 
(manufactured by Compugen), and the like. In the process 
of using such a software device, the default (initial 
setting) parameters are usually used, although the 
parameters can be changed, if necessary, in a manner known 
to one of ordinary skill in the art. 
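
To indicate what a homology search computes, a minimal sketch of the
Smith-Waterman local alignment score is shown below. It uses a linear
gap penalty; the scoring values (match, mismatch, gap) are arbitrary
assumptions for illustration and are not the default parameters of any
of the packages listed above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
partial = smith_waterman_score("TTACGTTT", "GGACGGG")
```

A higher score indicates a more similar local region; only two rows of
the matrix are kept because the sketch reports the score and not the
alignment itself.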

Such a recording medium storing the full genome 
sequence data is useful in preparing a polynucleotide array 
by which the expression amount of a gene encoded by the 
genome DNA of coryneform bacteria and the expression 
profile at the total gene level of the microorganism, 
namely, which genes among many genes encoded by the genome 
have been expressed and the expression ratio thereof, can 
be determined. 

The data recording device(s) provided by the 
present invention are, for example, memory device(s) for 
recording the data recorded in the recording medium of the 
present invention and target sequence or target structural 
motif data, or the like, and memory accessing device(s) 
for accessing the same. 

Namely, the system based on a computer according to 
the present invention comprises the following: 
(i) a user input device that inputs the information 
stored in the recording medium of the present invention, 
and target sequence or target structure motif information; 
(ii) a data storage device for at least temporarily 
storing the input information; 
(iii) a comparator that compares the information stored 
in the recording medium of the present invention with the 
target sequence or target structure motif information, 
recorded by the data storing device of (ii) for screening 
and analyzing nucleotide sequence information which is 
coincident with or analogous to the target sequence or 
target structure motif information; and 
(iv) an output device that shows a screening or 
analyzing result obtained by the comparator. 
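
The four elements (i) to (iv) can be pictured with the following
minimal sketch, in which a dictionary plays the role of the data
storage device, a function plays the role of the comparator, and a
formatting function stands for the output device. "Coincident" is
modeled as zero mismatches and "analogous" as a tolerated number of
mismatches; this simple mismatch count, like all names and sequences
in the sketch, is an assumption for illustration.

```python
def compare(stored, target, max_mismatches=0):
    """Comparator: return (name, position, mismatches) for each hit."""
    hits = []
    target = target.upper()
    if not target:
        return hits
    for name, seq in stored.items():
        seq = seq.upper()
        # Slide the target along the stored sequence, counting mismatches.
        for pos in range(len(seq) - len(target) + 1):
            window = seq[pos:pos + len(target)]
            mismatches = sum(1 for x, y in zip(window, target) if x != y)
            if mismatches <= max_mismatches:
                hits.append((name, pos, mismatches))
    return hits

def format_hits(hits):
    """Output device: render the screening result as text lines."""
    return [f"{name}\tposition={pos}\tmismatches={mm}"
            for name, pos, mm in hits]

# (i)/(ii) input held in storage: hypothetical stored sequences and target
stored = {"orf1": "ATGGCTAAGG", "orf2": "ATGGCAAAGG"}
exact_hits = compare(stored, "GCTAAG")                       # coincident
near_hits = compare(stored, "GCTAAG", max_mismatches=1)      # analogous
report = format_hits(near_hits)                              # (iv) output
```

In practice the comparator would be one of the homology searching
programs mentioned above instead of this mismatch count.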

This system is usable in the methods in items 2 to 
5 as described above for searching and analyzing the ORF 
and EMF domains, target sequence, target structural motif, 
etc. of a coryneform bacterium, searching homologs, 
searching and analyzing isozymes, determining the 
biosynthesis pathway and the signal transmission pathway, 
and identifying spots which have been found in the proteome 
analysis. The term "homologs" as used herein includes both 
orthologs and paralogs. 

10. Production of polypeptide using ORF derived from 
coryneform bacteria 

The polypeptide of the present invention can be 
produced using a polynucleotide comprising the ORF obtained 
in the above item 2. Specifically, the polypeptide of the 
present invention can be produced by expressing the 
polynucleotide of the present invention or a fragment 
thereof in a host cell, using the method described in 
Molecular Cloning, 2nd ed., Current Protocols in Molecular 
Biology, and the like, for example, according to the 
following method. 

A DNA fragment having a suitable length containing 
a part encoding the polypeptide is prepared from the full 
length ORF sequence, if necessary. 

Also, DNA in which nucleotides in a nucleotide 
sequence at a part encoding the polypeptide of the present 
invention are replaced to give a codon suitable for 
expression in the host cell is prepared, if necessary. The 
DNA is useful for efficiently producing the polypeptide of 
the present invention. 
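
The codon replacement described above can be sketched as follows: each
codon is translated with the standard genetic code and, where the host
has a preferred codon for that amino acid, the preferred codon is
substituted, so that the encoded amino acid sequence is unchanged. The
table of preferred codons below is hypothetical and deliberately
incomplete; real codon preferences depend on the host cell.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(orf):
    """Translate a DNA ORF into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[orf[i:i + 3].upper()]
                   for i in range(0, len(orf), 3))

def recode(orf, preferred):
    """Replace codons by host-preferred synonymous codons where given."""
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of three")
    out = []
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3].upper()
        amino_acid = CODON_TABLE[codon]
        # Codons without a listed preference are left as they are.
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)

# Hypothetical host preferences (illustrative only)
preferred_codons = {"L": "CTG", "K": "AAG", "A": "GCC"}
original = "ATGTTAAAAGCTTAA"
recoded = recode(original, preferred_codons)
```

Because only synonymous codons are exchanged, `translate(recoded)`
equals `translate(original)`.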

A recombinant vector is prepared by inserting the 
DNA fragment into the downstream of a promoter in a 
suitable expression vector. 

The recombinant vector is introduced to a host cell 
suitable for the expression vector. 

Any of bacteria, yeasts, animal cells, insect cells, 
plant cells, and the like can be used as the host cell so 
long as the gene of interest can be expressed in it. 

Examples of the expression vector include those 
which can replicate autonomously in the above-described 
host cell or can be integrated into a chromosome and have a 
promoter at such a position that the DNA encoding the 
polypeptide of the present invention can be transcribed. 

When a procaryote cell, such as a bacterium or the 
like, is used as the host cell, it is preferred that the 
recombinant vector containing the DNA encoding the 
polypeptide of the present invention can replicate 
autonomously in the bacterium and is a recombinant vector 
constituted by at least a promoter, a ribosome binding 
sequence, the DNA of the present invention and a 
transcription termination sequence. A promoter controlling 
gene can also be contained therewith in operable 
combination. 

Examples of the expression vectors include a vector 
plasmid which is replicable in Corynebacterium glutamicum, 
such as pCG1 (Japanese Published Unexamined Patent 
Application No. 134500/82), pCG2 (Japanese Published 
Unexamined Patent Application No. 35197/83), pCG4 (Japanese 
Published Unexamined Patent Application No. 183799/82), 
pCG11 (Japanese Published Unexamined Patent Application No. 
134500/82), pCG116, pCE54 and pCB101 (Japanese Published 
Unexamined Patent Application No. 105999/83), pCE51, pCE52 
and pCE53 (Mol. Gen. Genet., 196: 175-178 (1984)), and the 
like; a vector plasmid which is replicable in Escherichia 
coli, such as pET3 and pET11 (manufactured by Stratagene), 
pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), 
pKK223-3 and pGEX2T (manufactured by Amersham Pharmacia 
Biotech), and the like; and pBTrp2, pBTac1 and pBTac2 
(manufactured by Boehringer Mannheim Co.), pSE280 
(manufactured by Invitrogen), pGEMEX-1 (manufactured by 
Promega), pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese 
Published Unexamined Patent Application No. 110600/83), 
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. 
Biol. Chem., 53: 211 (1989)), pGEL1 (Proc. Natl. Acad. Sci. 
USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured 
by Stratagene), pTrS30 (prepared from Escherichia coli 
JM109/pTrS30 (FERM BP-5407)), pTrS32 (prepared from 
Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2 
(prepared from Escherichia coli IGHA2 (FERM B-400), 
Japanese Published Unexamined Patent Application No. 
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 
(FERM BP-6798), Japanese Published Unexamined Patent 
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 
4,939,094 and 5,160,735), pSupex, pUB110, pTP5, pC194 and 
pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX 
(manufactured by Pharmacia), pET system (manufactured by 
Novagen), and the like. 

Any promoter can be used so long as it can function 
in the host cell. Examples include promoters derived from 
Escherichia coli, phage and the like, such as trp promoter 
(Ptrp), lac promoter, PL promoter, PR promoter, T7 promoter 
and the like. Also, artificially designed and modified 
promoters, such as a promoter in which two Ptrp are linked 
in series (Ptrp x2), tac promoter, lacT7 promoter, letI 
promoter and the like, can be used. 

It is preferred to use a plasmid in which the space 
between the Shine-Dalgarno sequence, which is the ribosome 
binding sequence, and the initiation codon is adjusted to an 
appropriate distance (for example, 6 to 18 nucleotides). 
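
As a hedged illustration, the spacing can be checked as sketched
below. The spacing is taken here as the number of nucleotides lying
between the last base of the Shine-Dalgarno sequence and the first
base of the initiation codon. The motif "AGGAGG" and the example
sequence are assumptions for the sketch; actual ribosome binding
sequences vary with the host and the construct.

```python
def sd_spacing(sequence, start_codon_index, sd_motif="AGGAGG"):
    """Return nucleotides between the SD motif and the initiation codon.

    Returns None when no SD motif is found upstream of the codon.
    """
    upstream = sequence[:start_codon_index].upper()
    # Use the SD motif occurrence closest to the initiation codon.
    sd_start = upstream.rfind(sd_motif)
    if sd_start == -1:
        return None
    return start_codon_index - (sd_start + len(sd_motif))

def spacing_is_appropriate(spacing, low=6, high=18):
    """True when the spacing lies within the preferred range."""
    return spacing is not None and low <= spacing <= high

# Hypothetical construct: SD motif, 7-nucleotide spacer, then ATG
construct = "TTAGGAGGAACAGCTATGGCT"
spacing = sd_spacing(construct, 15)
ok = spacing_is_appropriate(spacing)
```

A construct whose spacing falls outside the range can be redesigned by
inserting or deleting nucleotides in the spacer.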

The transcription termination sequence is not 
always necessary for the expression of the DNA of the 
present invention. However, it is preferred to arrange the 
transcription termination sequence just downstream of 
the structural gene. 

One of ordinary skill in the art will appreciate 
that the codons of the above-described elements may be 
optimized, in a known manner, depending on the host cells 
and environmental conditions utilized. 

Examples of the host cell include microorganisms 
belonging to the genus Escherichia, the genus Serratia, the 
genus Bacillus, the genus Brevibacterium, the genus 
Corynebacterium, the genus Microbacterium, the genus 
Pseudomonas, and the like. Specific examples include 
Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, 
Escherichia coli DH1, Escherichia coli MC1000, Escherichia 
coli KY3276, Escherichia coli W1485, Escherichia coli JM109, 
Escherichia coli HB101, Escherichia coli No. 49, 
Escherichia coli W3110, Escherichia coli NY49, Escherichia 
coli GI698, Escherichia coli TB1, Serratia ficaria, 
Serratia fonticola, Serratia liquefaciens, Serratia 
marcescens, Bacillus subtilis, Bacillus amyloliquefaciens, 
Corynebacterium ammoniagenes, Brevibacterium immariophilum 
ATCC 14068, Brevibacterium saccharolyticum ATCC 14066, 
Corynebacterium glutamicum ATCC 13032, Corynebacterium 
glutamicum ATCC 13869, Corynebacterium glutamicum ATCC 
14067 (prior genus and species: Brevibacterium flavum), 
Corynebacterium glutamicum ATCC 13869 (prior genus and 
species: Brevibacterium lactofermentum, or Corynebacterium 
lactofermentum), Corynebacterium acetoacidophilum ATCC 
13870, Corynebacterium thermoaminogenes FERM 9244, 
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, 
Pseudomonas sp. D-0110, and the like. 

When Corynebacterium glutamicum or an analogous 
microorganism is used as a host, an EMF necessary for 
expressing the polypeptide is not always contained in the 
vector so long as the polynucleotide of the present 
invention contains an EMF. When the EMF is not contained 
in the polynucleotide, it is necessary to prepare the EMF 
separately and ligate it so as to be in operable 
combination. Also, when a higher expression amount or 
specific expression regulation is necessary, it is 
necessary to ligate the EMF corresponding thereto so as to 
put the EMF in operable combination with the polynucleotide. 
Examples of using an externally ligated EMF are disclosed 
in Microbiology, 142: 1297-1309 (1996). 

With regard to the method for the introduction of 
the recombinant vector, any method for introducing DNA into 
the above-described host cells, such as a method in which a 
calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110 
(1972)), a protoplast method (Japanese Published Unexamined 
Patent Application No. 2483942/88), the methods described 
in Gene, 17: 107 (1982) and Molecular & General Genetics, 
168: 111 (1979) and the like, can be used. 

When yeast is used as the host cell, examples of 
the expression vector include pYES2 (manufactured by 
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 
(ATCC 37419), pHS19, pHS15, and the like. 

Any promoter can be used so long as it can be 
expressed in yeast. Examples include a promoter of a gene 
in the glycolytic pathway, such as hexose kinase and the 
like, PHO5 promoter, PGK promoter, GAP promoter, ADH 
promoter, gal 1 promoter, gal 10 promoter, a heat shock 
protein promoter, MF α1 promoter, CUP 1 promoter, and the 
like. 

Examples of the host cell include microorganisms 
belonging to the genus Saccharomyces, the genus 
Schizosaccharomyces, the genus Kluyveromyces, the genus 
Trichosporon, the genus Schwanniomyces, the genus Pichia, 
the genus Candida and the like. Specific examples include 
Saccharomyces cerevisiae, Schizosaccharomyces pombe, 
Kluyveromyces lactis, Trichosporon pullulans, 
Schwanniomyces alluvius, Candida utilis and the like. 

With regard to the method for the introduction of 
the recombinant vector, any method for introducing DNA into 
yeast, such as an electroporation method (Methods Enzymol., 
194: 182 (1990)), a spheroplast method (Proc. Natl. Acad. 
Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. 
Bacteriol., 153: 163 (1983)), a method described in Proc. 
Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be 
used. 

When animal cells are used as the host cells, 
examples of the expression vector include pcDNA3.1, 
pSinRep5 and pCEP4 (manufactured by Invitrogen), pRev-Tre 
(manufactured by Clontech), pAxCAwt (manufactured by Takara 
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), 
pAGE107 (Japanese Published Unexamined Patent Application 
No. 22979/91; Cytotechnology, 3: 133 (1990)), pAS3-3 
(Japanese Published Unexamined Patent Application No. 
227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp 
(manufactured by Invitrogen), pREP4 (manufactured by 
Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), 
pAGE210, and the like. 

Any promoter can be used so long as it can function 
in animal cells. Examples include a promoter of IE 
(immediate early) gene of cytomegalovirus (CMV), an early 
promoter of SV40, a promoter of retrovirus, a 
metallothionein promoter, a heat shock promoter, SRα 
promoter, and the like. Also, the enhancer of the IE gene 
of human CMV can be used together with the promoter. 

Examples of the host cell include human Namalwa 
cell, monkey COS cell, Chinese hamster CHO cell, HBT5637 
(Japanese Published Unexamined Patent Application No. 
299/88), and the like. 

The method for introduction of the recombinant 
vector into animal cells is not particularly limited, so 
long as it is the general method for introducing DNA into 
animal cells, such as an electroporation method 
(Cytotechnology, 3: 133 (1990)), a calcium phosphate method 
(Japanese Published Unexamined Patent Application No. 
227075/90), a lipofection method (Proc. Natl. Acad. Sci. 
USA, 84: 7413 (1987)), the method described in Virology, 
52: 456 (1973), and the like. 

When insect cells are used as the host cells, the 
polypeptide can be expressed, for example, by the method 
described in Baculovirus Expression Vectors, A Laboratory 
Manual, W.H. Freeman and Company, New York (1992), 
Bio/Technology, 6: 47 (1988), or the like. 

Specifically, a recombinant gene transfer vector 
and baculovirus are simultaneously inserted into insect 
cells to obtain a recombinant virus in an insect cell 
culture supernatant, and then the insect cells are infected 
with the resulting recombinant virus to express the 
polypeptide. 

Examples of the gene introducing vector used in the 
method include pBlueBac4.5, pVL1392, pVL1393 and 
pBlueBacIII (manufactured by Invitrogen), and the like. 

Examples of the baculovirus include Autographa 
californica nuclear polyhedrosis virus with which insects 
of the family Barathra are infected, and the like. 

Examples of the insect cells include Spodoptera 
frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression 
Vectors, A Laboratory Manual, W.H. Freeman and Company, New 
York (1992)), Trichoplusia ni oocyte High 5 (manufactured 
by Invitrogen) and the like. 

Examples of the method for simultaneously incorporating 
the above-described recombinant gene transfer vector and the 
above-described baculovirus for the preparation of the 
recombinant virus include a calcium phosphate method 
(Japanese Published Unexamined Patent Application No. 
227075/90), a lipofection method (Proc. Natl. Acad. Sci. USA, 
84: 7413 (1987)) and the like. 

When plant cells are used as the host cells, 
examples of expression vector include a Ti plasmid, a 
tobacco mosaic virus vector, and the like. 

Any promoter can be used so long as it can be 
expressed in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV) , rice actin 1 promoter, and 
the like. 

Examples of the host cells include plant cells and 
the like, such as tobacco, potato, tomato, carrot, soybean, 
rape, alfalfa, rice, wheat, barley, and the like. 

The method for introducing the recombinant vector 
is not particularly limited, so long as it is the general 
method for introducing DNA into plant cells, such as the 
Agrobacterium method (Japanese Published Unexamined Patent 
Application No. 140885/84, Japanese Published Unexamined 
Patent Application No. 70080/85, WO 94/00977), the 
electroporation method (Japanese Published Unexamined 
Patent Application No. 251887/85), the particle gun method 
(Japanese Patents 2606856 and 2517813), and the like. 

The transformant of the present invention includes 
a transformant containing the polypeptide of the present 
invention per se rather than as a recombinant vector, that 
is, a transformant containing the polypeptide of the 
present invention which is integrated into a chromosome of 
the host, in addition to the transformant containing the 
above recombinant vector. 

When expressed in yeasts, animal cells, insect 
cells or plant cells, a glycopolypeptide or glycosylated 
polypeptide can be obtained. 

The polypeptide can be produced by culturing the 
thus obtained transformant of the present invention in a 
culture medium to produce and accumulate the polypeptide of 
the present invention or any polypeptide expressed under 
the control of an EMF of the present invention, and 
recovering the polypeptide from the culture. 

Culturing of the transformant of the present 
invention in a culture medium is carried out according to 
the conventional method as used in culturing of the host. 

When the transformant of the present invention is 
obtained using a prokaryote, such as Escherichia coli or 
the like, or a eukaryote, such as yeast or the like, as the 
host, the transformant is cultured as follows. 

Either a natural medium or a synthetic medium can 
be used, so long as it contains a carbon source, a nitrogen 
source, an inorganic salt and the like which can be 
assimilated by the transformant and allows the transformant 
to be cultured efficiently. 

Examples of the carbon source include those which 
can be assimilated by the transformant, such as 
carbohydrates (for example, glucose, fructose, sucrose, 
molasses containing them, starch, starch hydrolysate, and 
the like) , organic acids (for example, acetic acid, 
propionic acid, and the like) , and alcohols (for example, 
ethanol, propanol, and the like). 

Examples of the nitrogen source include ammonia, 
various ammonium salts of inorganic acids or organic acids 
(for example, ammonium chloride, ammonium sulfate, ammonium 
acetate, ammonium phosphate, and the like), other 
nitrogen-containing compounds, peptone, meat extract, yeast 
extract, corn steep liquor, casein hydrolysate, soybean meal 
and soybean meal hydrolysate, various fermented cells and 
hydrolysates thereof, and the like. 

Examples of inorganic salt include potassium 
dihydrogen phosphate, dipotassium hydrogen phosphate, 
magnesium phosphate, magnesium sulfate, sodium chloride, 
ferrous sulfate, manganese sulfate, copper sulfate, calcium 
carbonate, and the like. 

The culturing is carried out under aerobic 
conditions by shaking culture, submerged-aeration stirring 
culture or the like. The culturing temperature is 
preferably from 15 to 40°C, and the culturing time is 
generally from 16 hours to 7 days. The pH of the medium is 
preferably maintained at 3.0 to 9.0 during the culturing. 
The pH can be adjusted using an inorganic or organic acid, 
an alkali solution, urea, calcium carbonate, ammonia, or 
the like. 

Also, antibiotics, such as ampicillin, tetracycline, 
and the like, can be added to the medium during the 
culturing, if necessary. 

When a microorganism transformed with a recombinant 
vector containing an inducible promoter is cultured, an 
inducer can be added to the medium, if necessary. 

For example, isopropyl-β-D-thiogalactopyranoside 
(IPTG) or the like can be added to the medium when a 
microorganism transformed with a recombinant vector 
containing lac promoter is cultured, or indoleacrylic acid 
(IAA) or the like can be added thereto when a microorganism 
transformed with an expression vector containing trp 
promoter is cultured. 

Examples of the medium used in culturing a 
transformant obtained using animal cells as the host cells 
include RPMI 1640 medium (The Journal of the American 
Medical Association, 199: 519 (1967)), Eagle's MEM medium 
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium 
(Virology, 8: 396 (1959)), 199 Medium (Proceeding of the 
Society for the Biological Medicine, 73: 1 (1950)), the 
above-described media to which fetal calf serum has been 
added, and the like. 

The culturing is carried out generally at a pH of 6 
to 8 and a temperature of 30 to 40°C in the presence of 5% 
CO2 for 1 to 7 days. 

Also, if necessary, antibiotics, such as kanamycin, 
penicillin, and the like, can be added to the medium during 
the culturing. 

Examples of the medium used in culturing a 
transformant obtained using insect cells as the host cells 
include TNM-FH medium (manufactured by Pharmingen) , Sf-900 
II SFM (manufactured by Life Technologies) , ExCell 400 and 
ExCell 405 (manufactured by JRH Biosciences), Grace's 
Insect Medium (Nature, 195: 788 (1962)), and the like. 

The culturing is carried out generally at a pH of 6 
to 7 and a temperature of 25 to 30°C for 1 to 5 days. 

Additionally, antibiotics, such as gentamicin and 
the like, can be added to the medium during the culturing, 
if necessary. 

A transformant obtained by using a plant cell as 
the host cell can be used as the cell or after 
differentiating to a plant cell or organ. Examples of the 
medium used in the culturing of the transformant include 
Murashige and Skoog (MS) medium, White medium, media to 
which a plant hormone, such as auxin, cytokinin, or the 
like has been added, and the like. 

The culturing is carried out generally at a pH of 5 
to 9 and a temperature of 20 to 40°C for 3 to 60 days. 

Also, antibiotics, such as kanamycin, hygromycin 
and the like, can be added to the medium during the 
culturing, if necessary. 

As described above, the polypeptide can be produced 
by culturing a transformant derived from a microorganism, 
animal cell or plant cell containing a recombinant vector 
to which a DNA encoding the polypeptide of the present 
invention has been inserted according to the general 
culturing method to produce and accumulate the polypeptide, 
and recovering the polypeptide from the culture. 

The process of gene expression may include 
secretory production of the encoded protein, fusion 
protein expression, and the like in accordance with the 
methods described in Molecular Cloning, 2nd ed., in 
addition to direct expression. 

The method for producing the polypeptide of the 
present invention includes a method of intracellular 
expression in a host cell, a method of extracellular 
secretion from a host cell, or a method of production on a 
host cell membrane outer envelope. The method can be 
selected by changing the host cell employed or the 
structure of the polypeptide produced. 

When the polypeptide of the present invention is 
produced in a host cell or on a host cell membrane outer 
envelope, the polypeptide can be positively secreted 
extracellularly according to, for example, the method of 
Paulson et al. (J. Biol. Chem., 264: 17619 (1989)), the 
method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 
(1989); Genes Develop., 4: 1288 (1990)), and/or the methods 
described in Japanese Published Unexamined Patent 
Application No. 336963/93, WO 94/23021, and the like. 

Specifically, the polypeptide of the present 
invention can be positively secreted extracellularly by 
expressing it in the form in which a signal peptide has been 
added in front of a polypeptide containing an 
active site of the polypeptide of the present invention 
according to the recombinant DNA technique. 

Furthermore, the amount produced can be increased 
using a gene amplification system, such as by use of a 
dihydrofolate reductase gene or the like according to the 
method described in Japanese Published Unexamined Patent 
Application No. 227075/90. 

Moreover, the polypeptide of the present invention 
can be produced by a transgenic animal individual 
(transgenic nonhuman animal) or plant individual 
(transgenic plant). 

When the transformant is the animal individual or 
plant individual, the polypeptide of the present invention 
can be produced by breeding or cultivating it so as to 
produce and accumulate the polypeptide, and recovering the 
polypeptide from the animal individual or plant individual. 

Examples of the method for producing the 
polypeptide of the present invention using the animal 
individual include a method for producing the polypeptide 
of the present invention in an animal developed by 
inserting a gene according to methods known to those of 
ordinary skill in the art (American Journal of Clinical 
Nutrition, 63: 639S (1996) , American Journal of Clinical 
Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)). 

In the animal individual, the polypeptide can be 
produced by breeding a transgenic nonhuman animal to which 
the DNA encoding the polypeptide of the present invention 
has been inserted to produce and accumulate the polypeptide 
in the animal, and recovering the polypeptide from the 
animal. Examples of the production and accumulation place 
in the animal include milk (Japanese Published Unexamined 
Patent Application No. 309192/88), egg and the like of the 
animal. Any promoter can be used, so long as it can be 
expressed in the animal. Suitable examples include an 
α-casein promoter, a β-casein promoter, a β-lactoglobulin 
promoter, a whey acidic protein promoter, and the like, 
which are specific for mammary glandular cells. 

Examples of the method for producing the 
polypeptide of the present invention using the plant 
individual include a method for producing the polypeptide 
of the present invention by cultivating a transgenic plant 
into which the DNA encoding the polypeptide of the present 
invention has been introduced by a known method (Tissue 
Culture, 20 (1994); Tissue Culture, 21 (1994); Trends in 
Biotechnology, 15: 45 (1997)) to produce and accumulate the 
polypeptide in the plant, and recovering the polypeptide 
from the plant. 

The polypeptide according to the present invention 
can also be obtained by translation in vitro. 

The polypeptide of the present invention can be 
produced by a translation system in vitro. There are, for 
example, two in vitro translation methods which may be used, 
namely, a method using RNA as a template and another method 
using DNA as a template. The template RNA includes 
total RNA, mRNA, an in vitro transcription product, and the 
like. The template DNA includes a plasmid containing a 
transcriptional promoter and a target gene integrated 
therein downstream of the initiation site, a PCR/RT-PCR 
product and the like. To select the most suitable system 
for the in vitro translation, the origin of the gene 
encoding the protein to be synthesized (prokaryotic 
cell/eukaryotic cell), the type of the template (DNA/RNA), 
the purpose of using the synthesized protein and the like 
should be considered. In vitro translation kits having 
various characteristics are commercially available from 
many companies (Boehringer Mannheim, Promega, Stratagene, 
or the like), and any of these kits can be used in 
producing the polypeptide according to the present 
invention. 

Transcription/translation of a DNA nucleotide 
sequence cloned into a plasmid containing a T7 promoter can 
be carried out using an in vitro transcription/translation 
system, E. coli T7 S30 Extract System for Circular DNA 
(manufactured by Promega, catalogue No. L1130). Also, 
transcription/translation using, as a template, a linear 
prokaryotic DNA having a supercoiling-insensitive promoter, 
such as lacUV5, tac, λPL(con), λPL, or the like, can be 
carried out using an in vitro transcription/translation 
system, E. coli S30 Extract System for Linear Templates 
(manufactured by Promega, catalogue No. L1030). Examples 
of the linear prokaryotic DNA used as a template include a 
DNA fragment, a PCR-amplified DNA product, a ligation 
product of duplex oligonucleotides, an in vitro 
transcriptional RNA, a prokaryotic RNA, and the like. 

In addition to the production of the polypeptide 
according to the present invention, synthesis of a 
radioactively labeled protein, confirmation of the 
expression capability of a cloned gene, analysis of the 
function of a transcriptional reaction or translation 
reaction, and the like can be carried out using this system. 
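
The pairing of template type and extract system described above can be 
written down as a small lookup. The following is only an illustrative 
sketch: the function name, its arguments and the spelling of the 
promoter names are assumptions made for this example and are not part 
of any kit or library. 

```python
# Illustrative lookup only; names and argument conventions are hypothetical.
LINEAR_PROMOTERS = {"lacUV5", "tac", "λPL(con)", "λPL"}


def suggest_s30_system(template_form: str, promoter: str) -> str:
    """Return the extract system that the text pairs with a DNA template."""
    form = template_form.strip().lower()
    prom = promoter.strip()
    if form == "circular" and prom == "T7":
        return "E. coli T7 S30 Extract System for Circular DNA (L1130)"
    if form == "linear" and prom in LINEAR_PROMOTERS:
        return "E. coli S30 Extract System for Linear Templates (L1030)"
    # Combinations that the text does not mention are rejected, not guessed.
    raise ValueError(f"no system listed for {form!r} template with {prom!r}")


print(suggest_s30_system("circular", "T7"))
```

Combinations that are not named in the text raise an error rather than 
being mapped to a guessed system. 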

The polypeptide produced by the transformant of the 
present invention can be isolated and purified using the 
general method for isolating and purifying an enzyme. For 
example, when the polypeptide of the present invention is 
expressed as a soluble product in the host cells, the cells 
are collected by centrifugation after cultivation, 
suspended in an aqueous buffer, and disrupted using an 
ultrasonicator, a French press, a Manton Gaulin homogenizer, 
a Dynomill, or the like to obtain a cell-free extract. 
From the supernatant obtained by centrifuging the cell-free 
extract, a purified product can be obtained by the general 
method used for isolating and purifying an enzyme, for 
example, solvent extraction, salting out using ammonium 
sulfate or the like, desalting, precipitation using an 
organic solvent, anion exchange chromatography using a 
resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION 
HPA-75 (manufactured by Mitsubishi Chemical) or the like, 
cation exchange chromatography using a resin, such as S- 
Sepharose FF (manufactured by Pharmacia) or the like, 
hydrophobic chromatography using a resin, such as butyl 
sepharose, phenyl sepharose or the like, gel filtration 
using a molecular sieve, affinity chromatography, 
chromatofocusing, or electrophoresis, such as isoelectric 
focusing or the like, alone or in combination thereof. 

When the polypeptide is expressed as an insoluble 
product in the host cells, the cells are collected in the 
same manner, disrupted and centrifuged to recover the 
insoluble product of the polypeptide as the precipitate 
fraction. Next, the insoluble product of the polypeptide 
is solubilized with a protein denaturing agent. The 
solubilized solution is diluted or dialyzed to lower the 
concentration of the protein denaturing agent in the 
solution. Thus, the normal conformation of the 
polypeptide is restored. After the procedure, a 
purified product of the polypeptide can be obtained by a 
purification/isolation method similar to the above. 

When the polypeptide of the present invention or 
its derivative (for example, a polypeptide formed by adding 
a sugar chain thereto) is secreted out of cells, the 
polypeptide or its derivative can be collected in the 
culture supernatant. Namely, the culture supernatant is 
obtained by treating the culture in a manner similar to the 
above (for example, centrifugation). Then, a purified 
product can be obtained from the culture supernatant 
using a purification/isolation method similar to the above. 

The polypeptide obtained by the above method is 
within the scope of the polypeptide of the present 
invention, and examples include a polypeptide encoded by a 
polynucleotide comprising the nucleotide sequence selected 
from SEQ ID NOS:2 to 3431, and a polypeptide comprising an 
amino acid sequence represented by any one of SEQ ID 
NOS:3502 to 6931. 

Furthermore, a polypeptide comprising an amino acid 
sequence in which at least one amino acid is deleted, 
replaced, inserted or added in the amino acid sequence of 
the polypeptide and having substantially the same activity 
as that of the polypeptide is included in the scope of the 
present invention. The term "substantially the same 
activity as that of the polypeptide" means the same 
activity represented by the inherent function, enzyme 
activity or the like possessed by the polypeptide in which 
no amino acid has been deleted, replaced, inserted or 
added. The polypeptide can be obtained using a method for 
introducing site-directed mutation(s) described in, for 
example, Molecular Cloning, 2nd ed.; Current Protocols in 
Molecular Biology; Nuc. Acids. Res., 10: 6487 (1982); Proc. 
Natl. Acad. Sci. USA, 79: 6409 (1982); Gene, 34: 315 
(1985); Nuc. Acids. Res., 13: 4431 (1985); Proc. Natl. 
Acad. Sci. USA, 82: 488 (1985) and the like. For example, 
the polypeptide can be obtained by introducing mutation(s) 
into DNA encoding a polypeptide having the amino acid 
sequence represented by any one of SEQ ID NOS:3502 to 6931. 
The number of the amino acids which are deleted, replaced, 
inserted or added is not particularly limited; however, it 
is usually 1 to the order of tens, preferably 1 to 20, more 
preferably 1 to 10, and most preferably 1 to 5, amino acids. 

The at least one amino acid deletion, replacement, 
insertion or addition in the amino acid sequence of the 
polypeptide of the present invention is used herein to 
mean that at least one amino acid is deleted, replaced, 
inserted or added at one or plural positions in the 
amino acid sequence. The deletion, replacement, insertion 
or addition may be caused in the same amino acid sequence 
simultaneously. Also, the amino acid residue replaced, 
inserted or added can be natural or non-natural. Examples 
of the natural amino acid residue include L-alanine, 
L-asparagine, L-aspartic acid, L-glutamine, L-glutamic 
acid, glycine, L-histidine, L-isoleucine, L-leucine, 
L-lysine, L-methionine, L-phenylalanine, L-proline, 
L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine, 
L-cysteine, and the like. 

Herein, examples of amino acid residues which are 
replaced with each other are shown below. The amino acid 
residues in the same group can be replaced with each other. 
Group A: 

leucine, isoleucine, norleucine, valine, norvaline, 
alanine, 2-aminobutanoic acid, methionine, O-methylserine, 
t-butylglycine, t-butylalanine, cyclohexylalanine; 

Group B: 

aspartic acid, glutamic acid, isoaspartic acid, 
isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid; 

Group C: 

asparagine, glutamine; 

Group D: 

lysine, arginine, ornithine, 2,4-diaminobutanoic 
acid, 2,3-diaminopropionic acid; 

Group E: 

proline, 3-hydroxyproline, 4-hydroxyproline; 

Group F: 

serine, threonine, homoserine; 

Group G: 

phenylalanine, tyrosine. 
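
As an illustration, Groups A to G above can be held as a simple table 
and used to check whether two residues fall in the same group. The 
following is a minimal sketch; the names SUBSTITUTION_GROUPS, group_of 
and same_group are hypothetical and are introduced here only for the 
example. 

```python
# Groups A to G as listed in the text; residue names are plain strings.
SUBSTITUTION_GROUPS = {
    "A": {"leucine", "isoleucine", "norleucine", "valine", "norvaline",
          "alanine", "2-aminobutanoic acid", "methionine", "O-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"},
    "B": {"aspartic acid", "glutamic acid", "isoaspartic acid",
          "isoglutamic acid", "2-aminoadipic acid", "2-aminosuberic acid"},
    "C": {"asparagine", "glutamine"},
    "D": {"lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"},
    "E": {"proline", "3-hydroxyproline", "4-hydroxyproline"},
    "F": {"serine", "threonine", "homoserine"},
    "G": {"phenylalanine", "tyrosine"},
}


def group_of(residue):
    """Return the group label of a residue, or None if it is not listed."""
    for label, members in SUBSTITUTION_GROUPS.items():
        if residue in members:
            return label
    return None


def same_group(residue_a, residue_b):
    """True only if both residues are listed and share one group."""
    label = group_of(residue_a)
    return label is not None and label == group_of(residue_b)
```

For example, same_group("leucine", "valine") is true (both in Group A), 
whereas same_group("lysine", "glutamic acid") is false. 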

Also, in order that the resulting mutant 
polypeptide has substantially the same activity as that of 
the polypeptide which has not been mutated, it is preferred 
that the mutant polypeptide has a homology of 60% or more, 
preferably 80% or more, and particularly preferably 95% or 
more, with the polypeptide which has not been mutated, when 
calculated, for example, using default (initial setting) 
parameters by homology searching software, such as BLAST, 
FASTA, or the like. 
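
The thresholds above can be illustrated with a simplified calculation. 
The sketch below computes percent identity over two sequences that 
have already been aligned (gaps written as "-"); it is an assumption 
made for illustration and is not the scoring performed by BLAST or 
FASTA, which build their own alignments. The function names are 
hypothetical. 

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical columns as a percentage of all aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns where both sequences carry a gap hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def preference_tier(identity: float) -> str:
    """Map a percentage onto the tiers named in the text."""
    if identity >= 95:
        return "particularly preferred (95% or more)"
    if identity >= 80:
        return "preferred (80% or more)"
    if identity >= 60:
        return "60% or more"
    return "below 60%"
```

For the aligned pair "MKT-LV" and "MKTALV", five of six columns are 
identical, which falls in the "80% or more" tier. 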

Also, the polypeptide of the present invention can 
be produced by a chemical synthesis method, such as the 
Fmoc (fluorenylmethyloxycarbonyl) method, the tBoc 
(t-butyloxycarbonyl) method, or the like. It can also be 
synthesized using a peptide synthesizer manufactured by 
Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein 
Technology Instrument, Synthecell-Vega, PerSeptive, 
Shimadzu Corporation, or the like. 

The transformant of the present invention can be 
used for purposes other than the production of the 
polypeptide of the present invention. 

Specifically, at least one component selected from 
an amino acid, a nucleic acid, a vitamin, a saccharide, an 
organic acid, and analogues thereof can be produced by 
culturing the transformant containing the polynucleotide or 
recombinant vector of the present invention in a medium to 
produce and accumulate the at least one component in the 
medium, and recovering the same from the medium. 

The biosynthesis pathways, decomposition pathways 
and regulatory mechanisms of physiologically active 
substances such as amino acids, nucleic acids, vitamins, 
saccharides, organic acids and analogues thereof differ 
from organism to organism. The productivity of such a 
physiologically active substance can be improved using 
these differences, specifically by introducing a 
heterologous gene relating to the biosynthesis thereof. 
For example, the content of lysine, which is one of the 
essential amino acids, in a plant seed was improved by 
introducing a synthase gene derived from a bacterium (WO 
93/19190). Also, arginine is excessively produced in a 
culture by introducing an arginine synthase gene derived 
from Escherichia coli (Japanese Examined Patent Publication 
23750/93). 

To produce such a physiologically active substance, 
the transformant according to the present invention can be 
cultured by the same method as employed in culturing the 
transformant for producing the polypeptide of the present 
invention as described above. Also, the physiologically 
active substance can be recovered from the culture medium 
by a combination of, for example, the ion exchange resin 
method, the precipitation method and other known methods. 

Examples of methods for introducing the polynucleotide 
into a host which are known to one of ordinary skill in 
the art include electroporation, calcium transfection, 
the protoplast method, the method using a phage, and the 
like, when the host is a bacterium; and microinjection, 
calcium phosphate transfection, the positively charged 
lipid-mediated method, the method using a virus, and the 
like, when the host is a eukaryote (Molecular Cloning, 2nd 
ed.; Spector et al., Cells: a laboratory manual, Cold Spring 
Harbor Laboratory Press, 1998). Examples of the host 
include prokaryotes, lower eukaryotes (for example, yeasts), 
higher eukaryotes (for example, mammals), and cells 
isolated therefrom. The recombinant polynucleotide fragment 
present in the host cells can be integrated into the 
chromosome of the host. Alternatively, it can be integrated 
into a factor (for example, a plasmid) having an 
independent replication unit outside the chromosome. These 
transformants are usable in producing the polypeptides of 
the present invention encoded by the ORFs of the genome of 
Corynebacterium glutamicum, the polynucleotides of the 
present invention and fragments thereof. Alternatively, 
they can be used in producing arbitrary polypeptides under 
the regulation by an EMF of the present invention. 

11. Preparation of antibody recognizing the polypeptide of 
the present invention 

An antibody which recognizes the polypeptide of the 
present invention, such as a polyclonal antibody, a 
monoclonal antibody, or the like, can be produced using, as 
an antigen, a purified product of the polypeptide of the 
present invention or a partial fragment polypeptide of the 
polypeptide or a peptide having a partial amino acid 
sequence of the polypeptide of the present invention. 



(1) Production of polyclonal antibody 

A polyclonal antibody can be produced using, as an 
antigen, a purified product of the polypeptide of the 
present invention, a partial fragment polypeptide of the 
polypeptide, or a peptide having a partial amino acid 
sequence of the polypeptide of the present invention, and 
immunizing an animal with the same. 

Examples of the animal to be immunized include 
rabbits, goats, rats, mice, hamsters, chickens and the like. 

A dosage of the antigen is preferably 50 to 100 µg 
per animal. 

When the peptide is used as the antigen, it is 
preferably a peptide covalently bonded to a carrier protein, 
such as keyhole limpet haemocyanin, bovine thyroglobulin, 
or the like. The peptide used as the antigen can be 
synthesized by a peptide synthesizer. 

The administration of the antigen is, for example, 
carried out 3 to 10 times at intervals of 1 or 2 weeks 
after the first administration. On the 3rd to 7th day 
after each administration, a blood sample is collected from 
the venous plexus of the fundus of the eye, and it is 
confirmed that the serum reacts with the antigen by enzyme 
immunoassay (Enzyme-linked Immunosorbent Assay (ELISA), 
Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold 
Spring Harbor Laboratory (1988)) or the like. 

Serum is obtained from an immunized non-human 
mammal showing a sufficient antibody titer against the 
antigen used for the immunization, and a polyclonal 
antibody is obtained by isolation and purification from 
the serum. 

Examples of the method for the isolation and 
purification include centrifugation, salting out by 40-50% 
saturated ammonium sulfate, caprylic acid precipitation 
(Antibodies, A Laboratory Manual, Cold Spring Harbor 
Laboratory (1988)), or chromatography using a DEAE- 
Sepharose column, an anion exchange column, a protein A- or 
G-column, a gel filtration column, and the like, alone or 
in combination thereof, by methods known to those of 
ordinary skill in the art. 

(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

A rat whose serum shows a sufficient antibody 
titer against a partial fragment polypeptide of the 
polypeptide of the present invention used for immunization 
is used as a supply source of an antibody-producing cell. 

On the 3rd to 7th day after the final administration 
of the antigen substance to the rat showing the antibody 
titer, the spleen is excised. 

The spleen is cut to pieces in MEM medium 
(manufactured by Nissui Pharmaceutical), loosened using a 
pair of forceps, followed by centrifugation at 1,200 rpm 
for 5 minutes, and the resulting supernatant is discarded. 

The spleen cells in the precipitated fraction are treated 
with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2 
minutes to eliminate erythrocytes and washed three times 
with MEM medium, and the resulting spleen cells are used as 
antibody-producing cells. 

(b) Preparation of myeloma cells 

As myeloma cells, an established cell line obtained 
from mouse or rat is used. Examples of useful cell lines 
include those derived from a mouse, such as P3-X63Ag8-U1 
(hereinafter referred to as "P3-U1") (Curr. Topics in 
Microbiol. Immunol., 81: 1 (1978); Europ. J. Immunol., 
6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269 
(1978)); P3-X63-Ag8653 (653) (J. Immunol., 123: 1548 
(1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495 
(1975)), and the like, which are 8-azaguanine-resistant 
mouse (BALB/c) myeloma cell lines. These cell lines are 
subcultured in 8-azaguanine medium (a medium obtained by 
adding 1.5 mmol/l glutamine, 5×10⁻⁵ mol/l 
2-mercaptoethanol, 10 µg/ml gentamicin and 10% fetal 
calf serum (FCS) (manufactured by CSL) to RPMI-1640 medium 
(hereinafter referred to as the "normal medium"), to which 
8-azaguanine is further added at 15 µg/ml) and cultured in 
the normal medium 3 or 4 days before cell fusion, and 2×10⁷ 
or more of the cells are used for the fusion. 

(c) Production of hybridoma 

The antibody-producing cells obtained in (a) and 
the myeloma cells obtained in (b) are washed with MEM 
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium 
dihydrogen phosphate: 0.21 g, sodium chloride: 7.65 g, 
distilled water: 1 liter, pH: 7.2) and mixed to give a 
ratio of antibody-producing cells : myeloma cells = 5:1 
to 10:1, followed by centrifugation at 1,200 rpm for 5 
minutes, and the supernatant is discarded. 

The cells in the resulting precipitated fraction 
are thoroughly loosened, 0.2 to 1 ml of a mixed solution 
of 2 g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM 
medium and 0.7 ml of dimethyl sulfoxide (DMSO) per 10⁸ 
antibody-producing cells is added to the cells under 
stirring at 37°C, and then 1 to 2 ml of MEM medium is 
further added thereto several times at 1 to 2 minute 
intervals. 
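
The quantities used in the fusion step can be worked out from the 
number of antibody-producing cells. The sketch below is only 
illustrative arithmetic; the function name is hypothetical, and it 
assumes that the PEG mixed solution volume scales linearly with the 
cell number. 

```python
def fusion_plan(antibody_cells: float) -> dict:
    """Ranges implied by the text for a given number of spleen cells."""
    if antibody_cells <= 0:
        raise ValueError("cell number must be positive")
    return {
        # antibody-producing cells : myeloma cells = 5:1 to 10:1
        "myeloma_cells_min": antibody_cells / 10,
        "myeloma_cells_max": antibody_cells / 5,
        # 0.2 to 1 ml of the PEG mixed solution per 1e8 cells
        "peg_mix_ml_min": 0.2 * antibody_cells / 1e8,
        "peg_mix_ml_max": 1.0 * antibody_cells / 1e8,
    }


print(fusion_plan(2e8))
```

For 2×10⁸ antibody-producing cells this gives 2×10⁷ to 4×10⁷ myeloma 
cells and 0.4 to 2 ml of the PEG mixed solution. 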

After the addition, MEM medium is added to give a 
total amount of 50 ml. The resulting prepared solution is 
centrifuged at 900 rpm for 5 minutes, and then the 
supernatant is discarded. The cells in the resulting 
precipitated fraction are gently loosened and then gently 
suspended in 100 ml of HAT medium (the normal medium to 
which 10⁻⁴ mol/l hypoxanthine, 1.5×10⁻⁵ mol/l thymidine and 
4×10⁻⁷ mol/l aminopterin have been added) by repeated 
drawing up into and discharging from a measuring pipette. 

The suspension is poured into a 96-well culture 
plate at 100 µl/well and cultured at 37°C for 7 to 14 days 
in a 5% CO₂ incubator. 
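
As a worked illustration of the figures above, the amount of each HAT 
supplement in a given volume of medium and the number of 96-well 
plates filled at 100 µl/well can be calculated as follows. The 
concentrations are taken as read from the text (10⁻⁴, 1.5×10⁻⁵ and 
4×10⁻⁷ mol/l); the function names are hypothetical. 

```python
import math

# Final concentrations in HAT medium, in mol/l, as read from the text.
HAT_MOL_PER_L = {
    "hypoxanthine": 1e-4,
    "thymidine": 1.5e-5,
    "aminopterin": 4e-7,
}


def hat_supplements_umol(volume_ml: float) -> dict:
    """Micromoles of each supplement contained in volume_ml of HAT medium."""
    litres = volume_ml / 1000.0
    return {name: conc * litres * 1e6 for name, conc in HAT_MOL_PER_L.items()}


def plates_needed(volume_ml: float, ul_per_well: float = 100.0,
                  wells_per_plate: int = 96):
    """Number of wells filled and number of plates required."""
    wells = math.floor(volume_ml * 1000.0 / ul_per_well)
    return wells, math.ceil(wells / wells_per_plate)
```

For 100 ml of suspension this corresponds to about 10 µmol of 
hypoxanthine, 1.5 µmol of thymidine and 0.04 µmol of aminopterin, and 
to 1,000 wells, that is, eleven 96-well plates with the last one 
partly filled. 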

After culturing, a part of the culture supernatant 
is recovered, and a hybridoma which specifically reacts 
with a partial fragment polypeptide of the polypeptide of 
the present invention is selected according to the enzyme 
immunoassay described in Antibodies, A Laboratory Manual, 
Cold Spring Harbor Laboratory, Chapter 14 (1998) and the 
like. 

A specific example of the enzyme immunoassay is 
described below. 

The partial fragment polypeptide of the polypeptide 
of the present invention used as the antigen in the 
immunization is coated on a suitable plate, is allowed to 
react with a hybridoma culture supernatant or a purified 
antibody obtained in (d) described below as a first 
antibody, and is further allowed to react with an anti-rat 
or anti-mouse immunoglobulin antibody labeled with an 
enzyme, a chemiluminescent substance, a radioactive 
substance or the like as a second antibody, followed by a 
reaction suitable for the label. A hybridoma which 
specifically reacts with the polypeptide of the present 
invention is selected as a hybridoma capable of producing a 
monoclonal antibody of the present invention. 

Cloning of the hybridoma is repeated twice by 
limiting dilution (HT medium (a medium in which 
aminopterin has been removed from HAT medium) is used 
first, and the normal medium is used second), and a 
hybridoma which is stable and shows a sufficient 
antibody titer is selected as a hybridoma capable of 
producing a monoclonal antibody of the present invention. 

(d) Preparation of monoclonal antibody 

The monoclonal antibody-producing hybridoma cells 
obtained in (c) are injected intraperitoneally into 8- to 
10-week-old mice or nude mice treated with pristane 
(intraperitoneal administration of 0.5 ml of 
2,6,10,14-tetramethylpentadecane (pristane), followed by 2 
weeks of feeding) at 5×10⁶ to 20×10⁶ cells/animal. The 
hybridoma causes ascites tumor in 10 to 21 days. 

The ascitic fluid is collected from the mice or 
nude mice, and centrifuged at 3,000 rpm for 5 minutes to 
remove solid contents. 

A monoclonal antibody can be purified and isolated 
from the resulting supernatant according to the method 
similar to that used in the polyclonal antibody. 

The subclass of the antibody can be determined 
using a mouse monoclonal antibody typing kit or a rat 
monoclonal antibody typing kit. The polypeptide amount can 
be determined by the Lowry method or by calculation based 
on the absorbance at 280 nm. 
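
The calculation from the absorbance at 280 nm can be sketched with the 
Beer-Lambert relation, concentration = A280 / (extinction coefficient 
× path length). The text does not specify an extinction coefficient, 
so it is left as a parameter that must be supplied; the function name 
and the figure used in the note below are assumptions made for 
illustration. 

```python
def concentration_mg_per_ml(a280: float,
                            ext_coeff_ml_per_mg_cm: float,
                            path_cm: float = 1.0,
                            dilution: float = 1.0) -> float:
    """Protein concentration of the undiluted sample in mg/ml."""
    if ext_coeff_ml_per_mg_cm <= 0 or path_cm <= 0:
        raise ValueError("coefficient and path length must be positive")
    # Beer-Lambert: A = coefficient * concentration * path length
    return a280 * dilution / (ext_coeff_ml_per_mg_cm * path_cm)
```

With an assumed coefficient of 1.4 ml/(mg·cm), a value often quoted as 
an approximation for immunoglobulin G, a reading of 0.70 in a 1 cm 
cell corresponds to about 0.5 mg/ml. 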

The antibody obtained in the above is within the 
scope of the antibody of the present invention. 

The antibody can be used for general assays 
using an antibody, such as radioimmunoassay (RIA), 
competitive binding assay, an immunohistochemical 
staining method (ABC method, CSA method, etc.), 
immunoprecipitation, Western blotting, ELISA, 
and the like (An Introduction to Radioimmunoassay 
and Related Techniques, Elsevier Science (1986); Techniques 
in Immunocytochemistry, Academic Press, Vol. 1 (1982), 
Vol. 2 (1983) & Vol. 3 (1985); Practice and Theory of 
Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked 
Immunosorbent Assay (ELISA), Igaku Shoin (1976); 
Antibodies - A Laboratory Manual, Cold Spring Harbor 
Laboratory (1988); Monoclonal Antibody Experiment Manual, 
Kodansha Scientific (1987); Second Series Biochemical 
Experiment Course, Vol. 5, Immunobiochemistry Research 
Method, Tokyo Kagaku Dojin (1986)). 

The antibody of the present invention can be used 
as it is or after being labeled with a label. 

Examples of the label include a radioisotope, an 
affinity label (e.g., biotin, avidin, or the like), an 
enzyme label (e.g., horseradish peroxidase, alkaline 
phosphatase, or the like), a fluorescence label (e.g., FITC, 
rhodamine, or the like), a label using a rhodamine atom (J. 
Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308 
(1979); Immunol., 109: 129 (1972); J. Immunol. Meth., 
13: 215 (1979)), and the like. 

Expression of the polypeptide of the present 
invention, fluctuation of the expression, the presence or 
absence of structural change of the polypeptide, and the 
presence or absence in an organism other than coryneform 
bacteria of a polypeptide corresponding to the polypeptide 
can be analyzed using the antibody or the labeled antibody 
by the above assay, or a polypeptide array or proteome 
analysis described below. 

Furthermore, the polypeptide recognized by the 
antibody can be purified by immunoaffinity chromatography 
using the antibody of the present invention. 

12. Production and use of polypeptide array 

(1) Production of polypeptide array 

A polypeptide array can be produced using the 
polypeptide of the present invention obtained in the above 
item 10 or the antibody of the present invention obtained 
in the above item 11. 

The polypeptide array of the present invention 
includes protein chips, and comprises a solid support and 
the polypeptide or antibody of the present invention 
adhered to the surface of the solid support. 

Examples of the solid support include plastic, such 
as polycarbonate or the like; an acrylic resin, such as 
polyacrylamide or the like; complex carbohydrates, such as 
agarose, sepharose, or the like; silica; a silica-based 
material; carbon; a metal; inorganic glass; latex beads; 
and the like. 

The polypeptides or antibodies according to the 
present invention can be adhered to the surface of the 
solid support according to the method described in 
Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today, 
5: 326-7 (1999); Handbook of Experimental Immunology, 4th 
edition, Blackwell Scientific Publications, Chapter 10 
(1986); Meth. Enzym., 34 (1974); Advances in Experimental 
Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S. 
Patent 4,282,287; U.S. Patent 4,762,881, or the like. 

The analysis described herein can be efficiently 
performed by adhering the polypeptide or antibody of the 
present invention to the solid support at a high density, 
though a high fixation density is not always necessary. 

(2) Use of polypeptide array 

A polypeptide or a compound capable of binding to 
and interacting with the polypeptides of the present 
invention adhered to the array can be identified using the 
polypeptide array to which the polypeptides of the present 
invention have been adhered as described in the 
above (1). 

Specifically, a polypeptide or a compound capable 
of binding to and interacting with the polypeptides of the 
present invention can be identified by subjecting the 
polypeptides of the present invention to the following 
steps (i) to (iv): 

(i) preparing a polypeptide array having the 
polypeptide of the present invention adhered thereto by the 
method of the above (1); 

(ii) incubating the polypeptide immobilized on the 
polypeptide array together with at least one of a second 
polypeptide or compound; 

(iii) detecting any complex formed between the at least 
one of a second polypeptide or compound and the polypeptide 
immobilized on the array using, for example, a label bound 
to the at least one of a second polypeptide or compound, or 
a secondary label which specifically binds to the complex 
or to a component of the complex after unbound material has 
been removed; and 

(iv) analyzing the detection data. 
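
Step (iv) is not specified further in the text. One possible way of 
analyzing the detection data, shown here purely as an assumption, is 
to call a spot positive when its label signal exceeds the mean of 
blank (background) spots by a chosen number of standard deviations. 
The function name, the spot names and the default of three standard 
deviations are hypothetical. 

```python
from statistics import mean, stdev


def call_binders(signals: dict, background: list, n_sd: float = 3.0) -> list:
    """Return (spot, signal) pairs above background, strongest first."""
    # Threshold = background mean plus n_sd sample standard deviations.
    threshold = mean(background) + n_sd * stdev(background)
    hits = [(spot, value) for spot, value in signals.items()
            if value > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


blank_spots = [100.0, 102.0, 98.0, 101.0, 99.0]
array_signals = {"spot_1": 350.0, "spot_2": 103.0, "spot_3": 180.0}
print(call_binders(array_signals, blank_spots))
```

In this made-up data set only spot_1 and spot_3 exceed the threshold; 
spot_2 lies within the background range and is not called. 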

Specific examples of the polypeptide array to which 
the polypeptide of the present invention has been adhered 
include a polypeptide array containing a solid support to 
which is adhered at least one of a polypeptide containing 
an amino acid sequence selected from SEQ ID NOS:3502 to 
7001, a polypeptide containing an amino acid sequence in 
which at least one amino acid is deleted, replaced, 
inserted or added in the amino acid sequence of the 
polypeptide and having substantially the same activity as 
that of the polypeptide, a polypeptide containing an amino 
acid sequence having a homology of 60% or more with the 
amino acid sequence of the polypeptide and having 
substantially the same activity as that of the polypeptide, 
a partial fragment polypeptide, and a peptide comprising 
an amino acid sequence of a part of a polypeptide. 

The amount of production of a polypeptide derived 
from coryneform bacteria can be analyzed using a 
polypeptide array to which the antibody of the present 
invention has been adhered in the above (1) . 

Specifically, the expression amount of a gene 
derived from a mutant of coryneform bacteria can be 
analyzed by subjecting the gene to the following steps (i) 
to (iv): 

(i) preparing a polypeptide array by the method of the 
above (1); 

(ii) incubating the polypeptide array (the first 
antibody) together with a polypeptide derived from a mutant 
of coryneform bacteria; 

(iii) detecting the polypeptide bound to the polypeptide 
immobilized on the array using a labeled second antibody of 
the present invention; and 

(iv) analyzing the detection data. 

Specific examples of the polypeptide array to which 
the antibody of the present invention is adhered include a 
polypeptide array comprising a solid support to which is 
adhered at least one antibody which recognizes a polypeptide 
comprising an amino acid sequence selected from SEQ ID 
NOS:3502 to 7001, a polypeptide comprising an amino acid 
sequence in which at least one amino acid is deleted, 
replaced, inserted or added in the amino acid sequence of 
the polypeptide and having substantially the same activity 
as that of the polypeptide, a polypeptide comprising an 
amino acid sequence having a homology of 60% or more with 
the amino acid sequence of the polypeptide and having 
substantially the same activity as that of the polypeptide, 
a partial fragment polypeptide, or a peptide comprising an 
amino acid sequence of a part of a polypeptide. 

A fluctuation in an expression amount of a specific 
polypeptide can be monitored using a polypeptide obtained 
in the time course of culture as the polypeptide derived 
from coryneform bacteria. The culturing conditions can be 
optimized by analyzing the fluctuation. 
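
Monitoring such a fluctuation over the time course of culture can be 
illustrated as follows. The sketch expresses each array signal as a 
fold change relative to the earliest sample and reports the time of 
the highest signal; the function names and the data layout (culture 
time in hours mapped to signal) are assumptions made for this example. 

```python
def fold_changes(time_course: dict) -> dict:
    """Signal at each time point relative to the earliest time point."""
    times = sorted(time_course)
    baseline = time_course[times[0]]
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    return {t: time_course[t] / baseline for t in times}


def peak_time(time_course: dict):
    """Time point at which the signal is highest."""
    return max(time_course, key=time_course.get)


signal_by_hour = {0: 10.0, 6: 20.0, 12: 50.0, 24: 25.0}
print(fold_changes(signal_by_hour), peak_time(signal_by_hour))
```

In this made-up series the signal peaks at 12 hours at five times the 
initial level, which is the kind of observation that could guide the 
choice of culture time. 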

When a polypeptide derived from a mutant of 
coryneform bacteria is used, a mutated polypeptide can be 
detected. 

13. Identification of useful mutation in mutant by proteome 
analysis 

The term proteome analysis is generally used herein to 
refer to a method wherein polypeptides are separated by 
two-dimensional electrophoresis and a separated polypeptide 
is digested with an enzyme, followed by identification of 
the polypeptide using a mass spectrometer (MS) and 
searching a database. 

The two dimensional electrophoresis means an 
electrophoretic method which is performed by combining two 
electrophoretic procedures having different principles. 
For example, polypeptides are separated depending on 
molecular weight in the primary electrophoresis. Next, the 
gel is rotated by 90° or 180° and the secondary 
electrophoresis is carried out depending on isoelectric 
point. Thus, various separation patterns can be achieved 
(JIS K 3600 2474). 

In searching the database, the amino acid sequence 
information of the polypeptides of the present invention 
and the recording medium of the present invention provided 
for in the above items 2 and 8 can be used. 

The proteome analysis of a coryneform bacterium and 
its mutant makes it possible to identify a polypeptide 
showing a fluctuation therebetween. 

The proteome analysis of a wild type strain of
coryneform bacteria and a production strain showing an
improved productivity of a target product makes it possible
to efficiently identify a mutated protein which is useful
in breeding for improving the productivity of a target
product, or a protein whose expression amount fluctuates.

Specifically, a wild type strain of coryneform
bacteria and a lysine-producing strain thereof are each
subjected to the proteome analysis. Then, a spot increased
in the lysine-producing strain, compared with the wild type
strain, is found and a database is searched so that a
polypeptide showing an increase in yield in accordance with
an increase in the lysine productivity can be identified.
For example, as a result of the proteome analysis on a wild
type strain and a lysine-producing strain, the productivity
of the catalase having the amino acid sequence represented
by SEQ ID NO:3785 is increased in the lysine-producing
mutant.
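
The spot comparison described above can be sketched as follows. This is only an illustrative sketch: the spot identifiers, the intensity values and the two-fold threshold are hypothetical and are not taken from the present specification, and any real gel-image quantification would precede this step.

```python
from typing import Dict, List, Tuple


def increased_spots(wild: Dict[str, float],
                    producer: Dict[str, float],
                    min_ratio: float = 2.0) -> List[Tuple[str, float]]:
    """Return (spot_id, ratio) for spots whose intensity in the producing
    strain is at least min_ratio times the wild type intensity."""
    hits = []
    for spot_id, wt_intensity in wild.items():
        mut_intensity = producer.get(spot_id)
        # Skip spots missing from one gel or with no wild type signal.
        if mut_intensity is None or wt_intensity <= 0:
            continue
        ratio = mut_intensity / wt_intensity
        if ratio >= min_ratio:
            hits.append((spot_id, ratio))
    # Largest increase first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical spot intensities (arbitrary units).
wild_type = {"spot_A": 100.0, "spot_B": 80.0, "spot_C": 50.0}
lysine_producer = {"spot_A": 110.0, "spot_B": 240.0, "spot_C": 45.0}

print(increased_spots(wild_type, lysine_producer))
```

The spots returned by such a comparison would then be the candidates submitted to mass spectrometry and the database search.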

When a protein having a high expression level is
identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence
information of the genome of the coryneform bacteria of
the present invention, and a recording medium storing the
sequences, the nucleotide sequence of the gene encoding
this protein and the nucleotide sequence upstream thereof
can be searched at the same time, and thus, a nucleotide
sequence having a high expression promoter can be
efficiently selected.
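
As a rough sketch of the upstream search described above, the region preceding a gene can be cut out of the genome sequence from the ORF coordinates. The function names, the 300-nucleotide default and the toy sequence are hypothetical; the sketch also assumes 1-based coordinates and that an initial position larger than the terminal position denotes an ORF on the complementary strand.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_region(genome: str, initial: int, terminal: int,
                    length: int = 300) -> str:
    """Return up to `length` nucleotides upstream of an ORF.

    Coordinates are 1-based and inclusive (assumption). When
    initial > terminal the ORF is taken to lie on the complementary
    strand, so its upstream region follows `initial` on the forward
    strand and is returned as the reverse complement.
    """
    if initial <= terminal:
        start = max(0, initial - 1 - length)
        return genome[start:initial - 1]
    end = min(len(genome), initial + length)
    return reverse_complement(genome[initial:end])


toy_genome = "ATGCATTTGACC"  # hypothetical 12-nt sequence
print(upstream_region(toy_genome, 7, 9, length=3))  # forward-strand ORF
print(upstream_region(toy_genome, 6, 4, length=3))  # complementary-strand ORF
```

A candidate promoter would then be sought within the returned region.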

In the proteome analysis, a spot on the
two-dimensional electrophoresis gel showing a fluctuation
is sometimes derived from a modified protein. However, the
modified protein can be efficiently identified using the
nucleotide sequence information and the amino acid
sequence information of the genome of coryneform bacteria,
and the recording medium storing the sequences, according
to the present invention.

Moreover, a useful mutation point in a useful
mutant can be easily specified by searching a nucleotide
sequence (nucleotide sequence of promoters, ORF, or the
like) relating to the thus identified protein using the
nucleotide sequence information and the amino acid
sequence information of the genome of coryneform bacteria
of the present invention, and a recording medium storing
the sequences, and using a primer designed on the basis of
the detected nucleotide sequence. Once the useful mutation
point is specified, an industrially useful mutant having
the useful mutation or other useful mutation derived
therefrom can be easily bred.

The present invention will be explained in detail 
below based on Examples. However, the present invention is 
not limited thereto. 

Example 1

Determination of the full nucleotide sequence of genome of
Corynebacterium glutamicum

The full nucleotide sequence of the genome of
Corynebacterium glutamicum was determined based on the
whole genome shotgun method (Science, 269: 496-512 (1995)).
In this method, a genome library was prepared and the
terminal sequences were determined at random.
Subsequently, these sequences were ligated on a computer to
cover the full genome. Specifically, the following
procedure was carried out.

(1) Preparation of genome DNA of Corynebacterium glutamicum
ATCC 13032

Corynebacterium glutamicum ATCC 13032 was cultured
in BY medium (7 g/l meat extract, 10 g/l peptone, 3 g/l
sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1%
of glycine at 30°C overnight and the cells were collected
by centrifugation. After washing with STE buffer (10.3%
sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l EDTA, pH
8.0), the cells were suspended in 10 ml of STE buffer
containing 10 mg/ml lysozyme, followed by gently shaking at
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto
to lyse the cells, and the resultant mixture was maintained
at 65°C for 10 minutes and then cooled to room temperature.
Then, 10 ml of Tris-neutralized phenol was added thereto,
followed by gently shaking at room temperature for 30
minutes and centrifugation (15,000 x g, 20 minutes, 20°C).
The aqueous layer was separated and subjected to extraction
with phenol/chloroform and extraction with chloroform
(twice) in the same manner. To the aqueous layer, 3 mol/l
sodium acetate solution (pH 5.2) and isopropanol were added
at 1/10 times volume and twice volume, respectively,
followed by gently stirring to precipitate the genome DNA.
The genome DNA was dissolved again in 3 ml of TE buffer (10
mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0)
containing 0.02 mg/ml of RNase and maintained at 37°C for
45 minutes. The extractions with phenol, phenol/chloroform
and chloroform were carried out successively in the same
manner as the above. The genome DNA was subjected to
isopropanol precipitation. The thus formed genome DNA
precipitate was washed with 70% ethanol three times,
followed by air-drying, and dissolved in 1.25 ml of TE
buffer to give a genome DNA solution (concentration: 0.1
mg/ml).

(2) Construction of a shotgun library 

TE buffer was added to 0.01 mg of the thus prepared
genome DNA of Corynebacterium glutamicum ATCC 13032 to give
a total volume of 0.4 ml, and the mixture was treated with
a sonicator (Yamato Powersonic Model 150) at an output of
20 continuously for 5 seconds to obtain fragments of 1 to
10 kb. The genome fragments were blunt-ended using a DNA
blunting kit (manufactured by Takara Shuzo) and then
fractionated by 6% polyacrylamide gel electrophoresis.
Genome fragments of 1 to 2 kb were cut out from the gel,
and 0.3 ml MG elution buffer (0.5 mol/l ammonium acetate,
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was
added thereto, followed by shaking at 37°C overnight to
elute DNA. The DNA eluate was treated with
phenol/chloroform, and then precipitated with ethanol to
obtain a genome library insert. The total insert and 500
ng of pUC18 SmaI/BAP (manufactured by Amersham Pharmacia
Biotech) were ligated at 16°C for 40 hours.

The ligation product was precipitated with ethanol
and dissolved in 0.01 ml of TE buffer. The ligation
solution (0.001 ml) was introduced into 0.04 ml of E. coli
ELECTROMAX DH10B (manufactured by Life Technologies) by
electroporation under conditions according to the
manufacturer's instructions. The mixture was spread on LB
plate medium (LB medium (10 g/l bactotrypton, 5 g/l yeast
extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of
agar) containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and
1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and
cultured at 37°C overnight.

The transformant obtained from colonies formed on
the plate medium was stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1
mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB
medium containing 20% glycerol was added thereto, followed
by stirring to obtain a glycerol stock.

(3) Construction of cosmid library 

About 0.1 mg of the genome DNA of Corynebacterium
glutamicum ATCC 13032 was partially digested with Sau3AI
(manufactured by Takara Shuzo) and then ultracentrifuged
(26,000 rpm, 18 hours, 20°C) under 10 to 40% sucrose
density gradient obtained using 10% and 40% sucrose buffers
(1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA,
10% or 40% sucrose, pH 8.0). After the centrifugation, the
solution thus separated was fractionated into tubes at 1 ml
in each tube. After confirming the DNA fragment length of
each fraction by agarose gel electrophoresis, a fraction
containing a large amount of DNA fragment of about 40 kb
was precipitated with ethanol.

The DNA fragment was ligated to the BamHI site of
superCos1 (manufactured by Stratagene) in accordance with
the manufacturer's instructions. The ligation product was
incorporated into Escherichia coli XL-1-Blue MR strain
(manufactured by Stratagene) using Gigapack III Gold
Packaging Extract (manufactured by Stratagene) in
accordance with the manufacturer's instructions. The
Escherichia coli was spread on LB plate medium containing
0.1 mg/ml ampicillin and cultured thereon at 37°C overnight
to isolate colonies. The resulting colonies were
stationarily cultured at 37°C overnight in a 96-well titer
plate containing 0.05 ml of the LB medium containing 0.1
mg/ml ampicillin in each well. LB medium containing 20%
glycerol (0.05 ml) was added thereto, followed by stirring
to obtain a glycerol stock.

(4) Determination of nucleotide sequence 
(4-1) Preparation of template 

The full nucleotide sequence of Corynebacterium
glutamicum ATCC 13032 was determined mainly based on the
whole genome shotgun method. The template used in the
whole genome shotgun method was prepared by the PCR method
using the library prepared in the above (2).

Specifically, the clone derived from the whole
genome shotgun library was inoculated using a replicator
(manufactured by GENETIX) into each well of a 96-well plate
containing the LB medium containing 0.1 mg/ml of ampicillin
at 0.08 ml per each well and then stationarily cultured at
37°C overnight.

Next, the culturing solution was transferred using
a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a
PCR reaction solution (TaKaRa Ex Taq (manufactured by
Takara Shuzo)) at 0.08 ml per each well. Then, PCR was
carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700
(manufactured by PE Biosystems) to amplify the inserted
fragment.

The excessive primers and nucleotides were
eliminated using a kit for purifying a PCR product
(manufactured by Amersham Pharmacia Biotech) and the
residue was used as the template in the sequencing reaction.

Some nucleotide sequences were determined using a
double-stranded DNA plasmid as a template.

The double-stranded DNA plasmid as the template was 
obtained by the following method. 

The clone derived from the whole genome shotgun
library was inoculated into a 24- or 96-well plate
containing a 2x YT medium (16 g/l bactotrypton, 10 g/l
yeast extract, 5 g/l sodium chloride, pH 7.0) containing
0.05 mg/ml ampicillin at 1.5 ml per each well and then
cultured under shaking at 37°C overnight.

The double-stranded DNA plasmid was prepared from
the culturing solution using an automatic plasmid preparing
machine, KURABO PI-50 (manufactured by Kurabo Industries)
or a MultiScreen (manufactured by Millipore) in accordance
with the protocol provided by the manufacturer.

To purify the double-stranded DNA plasmid using the
MultiScreen, Biomek 2000 (manufactured by Beckman Coulter)
or the like was employed.

The thus obtained double-stranded DNA plasmid was
dissolved in water to give a concentration of about 0.1
mg/ml and used as the template in sequencing.

(4-2) Sequencing reaction 

To 6 µl of a solution of ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems), an M13 regular direction
primer (M13-21) or an M13 reverse direction primer (M13REV)
(DNA Research, 5: 1-9 (1998)) and the template prepared in
the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The
primers and the templates were used in an amount of 1.6
pmol and an amount of 50 to 200 ng, respectively.

Dye terminator sequencing reaction of 45 cycles was
carried out with GeneAmp PCR System 9700 (manufactured by
PE Biosystems) using the reaction solution. The cycle
parameter was determined in accordance with the
manufacturer's instruction accompanying ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit. The sample
was purified using MultiScreen HV plate (manufactured by
Millipore) according to the manufacturer's instructions. The
thus purified reaction product was precipitated with
ethanol, followed by drying, and then stored in the dark at
-30°C.

The dry reaction product was analyzed by ABI PRISM
377 DNA Sequencer and ABI PRISM 3700 DNA Analyzer (both
manufactured by PE Biosystems) each in accordance with the
manufacturer's instructions.

The data of about 50,000 sequences in total (i.e.,
about 42,000 sequences obtained using 377 DNA Sequencer and
about 8,000 reactions obtained by 3700 DNA Analyzer) were
transferred to a server (Alpha Server 4100: manufactured by
COMPAQ) and stored. The data of these about 50,000
sequences corresponded to 6 times as much as the genome
size.
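
The relation between the number of sequences and the 6-fold redundancy stated above can be checked with a short calculation. This is only a sketch: the genome size of about 3.3 Mb is an assumption introduced here for illustration (it is not stated in this section), and the mean read length is derived from it rather than taken from the text.

```python
def fold_coverage(n_reads: int, mean_read_length: float,
                  genome_size: int) -> float:
    """Redundancy = total sequenced bases / genome size."""
    return n_reads * mean_read_length / genome_size


def implied_mean_read_length(n_reads: int, coverage: float,
                             genome_size: int) -> float:
    """Mean read length needed to reach a given fold coverage."""
    return coverage * genome_size / n_reads


ASSUMED_GENOME_SIZE = 3_300_000  # assumption, in base pairs
N_READS = 50_000                 # "about 50,000 sequences" in the text
COVERAGE = 6.0                   # "6 times as much as the genome size"

read_len = implied_mean_read_length(N_READS, COVERAGE, ASSUMED_GENOME_SIZE)
print(read_len)
print(fold_coverage(N_READS, read_len, ASSUMED_GENOME_SIZE))
```

Under the assumed genome size, the stated figures would correspond to a mean usable read length of roughly 400 bases.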

(5) Assembly 

All operations were carried out on a UNIX platform.
The analytical data were output on a Macintosh platform
using X Window System. The base call was carried out using
phred (The University of Washington). The vector sequence
data was deleted using SPS Cross_Match (manufactured by
Southwest Parallel Software). The assembly was carried out
using SPS phrap (manufactured by Southwest Parallel
Software; a high-speed version of phrap (The University of
Washington)). The contig obtained by the assembly was
analyzed using a graphical editor, consed (The University
of Washington). A series of the operations from the base
call to the assembly were carried out in a single run using
a script phredPhrap attached to consed.

(6) Determination of nucleotide sequence in gap part

Each cosmid in the cosmid library constructed in
the above (3) was prepared by a method similar to the
preparation of the double-stranded DNA plasmid described in
the above (4-1). The nucleotide sequence at the end of the
inserted fragment of the cosmid was determined by using ABI
PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit
(manufactured by PE Biosystems) according to the
manufacturer's instructions.

About 800 cosmid clones were sequenced at both ends,
and the contigs derived from the shotgun sequencing
obtained in the above (5) were searched for nucleotide
sequences coincident with these end sequences. Thus, the
linkage between respective cosmid clones and respective
contigs was determined and mutual alignment was carried
out. Furthermore, the results were compared with the
physical map of Corynebacterium glutamicum ATCC 13032
(Mol. Gen. Genet., 252: 255-265 (1996)) to carry out
mapping between the cosmids and the contigs.

The sequence in the region which was not covered 
with the contigs was determined by the following method. 

Clones containing sequences positioned at the ends
of contigs were selected. Among these clones, about 1,000
clones wherein only one end of the inserted fragment had
been determined were selected and the sequence at the
opposite end of the inserted fragment was determined. A
shotgun library clone or a cosmid clone containing the
sequences at the respective ends of the inserted fragment
in two contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and
thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the
gap part was available, primers complementary to the end
sequences of the two contigs were prepared and the DNA
fragment in the gap part was amplified by PCR. Then,
sequencing was performed by the primer walking method using
the amplified DNA fragment as a template or by the shotgun
method in which the sequence of a shotgun clone prepared
from the amplified DNA fragment was determined. Thus, the
nucleotide sequence of the region was determined.
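
The selection of clones bridging two contigs, as described above, can be sketched as follows. This is a hypothetical illustration only: the clone and contig names are invented, and the mapping of each end sequence to a contig is assumed to have been obtained beforehand by a sequence search.

```python
from typing import Dict, List, Optional, Tuple

# clone name -> (contig hit by one end, contig hit by the other end);
# None means that the end was not sequenced or matched no contig.
EndHits = Dict[str, Tuple[Optional[str], Optional[str]]]


def gap_spanning_clones(end_hits: EndHits) -> Dict[Tuple[str, str], List[str]]:
    """Group clones whose two ends fall in two different contigs."""
    links: Dict[Tuple[str, str], List[str]] = {}
    for clone, (first, second) in end_hits.items():
        # A clone bridges a gap only if both ends are placed
        # and the two contigs differ.
        if first is None or second is None or first == second:
            continue
        contig_pair = tuple(sorted((first, second)))
        links.setdefault(contig_pair, []).append(clone)
    return links


# Hypothetical end-sequence placements.
example_hits = {
    "clone_1": ("contig_1", "contig_2"),
    "clone_2": ("contig_1", "contig_1"),
    "clone_3": ("contig_2", None),
    "clone_4": ("contig_2", "contig_1"),
}
print(gap_spanning_clones(example_hits))
```

Contig pairs for which no bridging clone is found would correspond to the gaps closed by PCR with primers designed at the contig ends.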

In a region showing a low sequence precision,
primers were synthesized using AUTOFINISH function and
navigating function of consed (The University of
Washington) and the sequence was determined by the primer
walking method to improve the sequence precision. The thus
determined full nucleotide sequence of the genome of
Corynebacterium glutamicum ATCC 13032 strain is shown in
SEQ ID NO:1.

(7) Identification of ORF and presumption of its function

ORFs in the nucleotide sequence represented by SEQ
ID NO:1 were identified according to the following method.
First, the ORF regions were determined using software for
identifying ORF, i.e., Glimmer, GeneMark and GeneMark.hmm
on UNIX platform according to the respective manual
attached to the software.

Based on the data thus obtained, ORFs in the
nucleotide sequence represented by SEQ ID NO:1 were
identified.

The putative function of an ORF was determined by
searching the homology of the identified amino acid
sequence of the ORF against an amino acid database
consisting of protein-encoding domains derived from
Swiss-Prot, PIR or Genpept database (constituted by
protein-encoding domains derived from GenBank database)
using Frame Search (manufactured by Compugen), or by
searching the homology of the identified amino acid
sequence of the ORF against the same amino acid database
using BLAST. The nucleotide sequences of the thus
determined ORFs are shown in SEQ ID NOS:2 to 3501, and the
amino acid sequences encoded by these ORFs are shown in
SEQ ID NOS:3502 to 7001.

In some cases of the sequence listings in the
present invention, nucleotide sequences, such as TTG, TGT,
GGT, and the like, other than ATG, are read as an
initiating codon encoding Met.
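
To illustrate how the choice of initiating codons affects which ORFs are reported, a deliberately naive forward-strand ORF scan is sketched below. It is not the method of Glimmer, GeneMark or GeneMark.hmm, which use statistical models; the function name, the minimum length and the toy sequence are hypothetical.

```python
from typing import List, Sequence, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, starts: Sequence[str] = ("ATG",),
              min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return 1-based (initial, terminal) positions of forward-strand
    ORFs, from the first start codon in a frame to the next in-frame
    stop codon (stop codon included)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start_pos = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start_pos is None and codon in starts:
                start_pos = i
            elif start_pos is not None and codon in STOP_CODONS:
                # Count codons before the stop codon.
                if (i - start_pos) // 3 >= min_codons:
                    orfs.append((start_pos + 1, i + 3))
                start_pos = None
    return sorted(orfs)


toy_sequence = "CCATGAAATAGCCTTGGCGTGA"  # hypothetical 22-nt sequence
print(find_orfs(toy_sequence))                         # ATG only
print(find_orfs(toy_sequence, starts=("ATG", "TTG")))  # ATG or TTG
```

In this toy sequence, allowing TTG as an initiating codon reveals a second ORF that an ATG-only scan misses.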

Also, the preferred nucleotide sequences are SEQ ID
NOS:2 to 355 and 357 to 3501, and the preferred amino acid
sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to
7001.

Table 1 shows the registration numbers in the
above-described databases of sequences which were judged as
having the highest homology with the nucleotide sequences
of the ORFs as the results of the homology search in the
amino acid sequences using the homology-searching software
Frame Search (manufactured by Compugen), names of the genes
of these sequences, the functions of the genes, and the
matched lengths, identities and analogies compared with
publicly known amino acid translation sequences. Moreover,
the corresponding positions were confirmed via the
alignment of the nucleotide sequence of an arbitrary ORF
with the nucleotide sequence of SEQ ID NO:1. Also, the
positions of nucleotide sequences other than the ORFs (for
example, ribosomal RNA genes, transfer RNA genes, IS
sequences, and the like) on the genome were determined.
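
The matched length, identity and similarity values reported in Table 1 can be illustrated with a small calculation on a pairwise alignment. This is a sketch under stated assumptions: the grouping of similar residues below is hypothetical and is not the scoring used by Frame Search, gap columns are assumed to be excluded from the matched length, and the two aligned sequences are invented.

```python
from typing import Tuple

# Hypothetical grouping of residues regarded as similar (assumption).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def identity_and_similarity(aligned_a: str,
                            aligned_b: str) -> Tuple[int, float, float]:
    """Return (matched length, identity %, similarity %) for two aligned
    amino acid sequences of equal length, with '-' marking gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matched = identical = similar = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are not counted (assumption)
        matched += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if matched == 0:
        raise ValueError("no aligned residue pairs")
    return matched, 100.0 * identical / matched, 100.0 * similar / matched


length, identity, similarity = identity_and_similarity("MKV-LDE", "MRVALEQ")
print(length, round(identity, 1), round(similarity, 1))
```

With such a definition, similarity is never lower than identity, which is consistent with the paired values listed in Table 1.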

Fig. 1 shows the positions of typical genes of the
Corynebacterium glutamicum ATCC 13032 on the genome.

Function

hypothetical membrane protein
2,5-diketo-D-gluconic acid reductase
5-nucleotidase precursor
5-nucleotidase family protein
transposase
organic hydroperoxide detoxication enzyme
ATP-dependent DNA helicase
glucan 1,4-alpha-glucosidase
lipoprotein
ABC 3 transport family or integral membrane protein
iron(III) dicitrate transport ATP-binding protein
sugar ABC transporter, periplasmic sugar-binding protein
high affinity ribose transport protein
ribose transport ATP-binding protein
neurofilament subunit NF-180
peptidyl-prolyl cis-trans isomerase A
hypothetical membrane protein


Matched 
length 
(a.a) 


CN 
CO 
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CN 


CO 
CD 


c5 « 


CD 
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CO 
CD 
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CO 
CO 
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CN 
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CD 

CO XT 
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o> CD 
Similarity
50.8
56.1


56.7 
60.8


54.1 
56.5


68.3 


76.7 
44 4 


CD t- 

CD CO 
CO LO 


Identity
(%) 


24.9 


65.4 


27.0 


27.0 


51.8 


32.7 


26.7 


28.9 


34.6 


39.2 


25.8 


30.5 


32.2 


79.9 
29.2 


Homologous gene 


Mycobacterium leprae 
MLCB1788.18 


Corynebacterium sp. ATCC 
31090 


Vibrio parahaemolyticus nutA 


Deinococcus radiodurans 
DR0505 


? 
) 

i 1 
- 1 

CO 

5 ° 
z tn 

U (C L. 

0) o o 
c x: cu 

O CO _C 

J> X Q- 


Thiobacillus ferrooxidans recG 


Saccharomyces cerevisiae 
S288C Y1R019C stal 


Erysipelothrix rhusiopathiae 
ewlA 


Streptococcus pyogenes SF370 
mtsC 


Escherichia coli K12 fecE


Thermotoga maritima MSB8 
TM0114 


Escherichia coli K12 rbsC


Bacillus subtilis 168 rbsA 


reuurny^uii men uiuo 
Mycobacterium leprae H37RV 
RV0009 ppiA 
Bacillus subtilis 168 yqgP 
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CO 
CO 
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O 

cL 

CD 
sp:5NTD_VIBPA
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CD 
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O LO 

CO CO 
CO CO 

lo 

CM CM 
sp:RECG_THIFE


sp:AMYH_YEAST 


gp:ERU52850_1 


gp:AF180520_3 


sp:FECE_ECOLI


pir:A72417 


prf:1207243B 


sp:RBSA_BACSU 


g <' a-' 

£ cl CD 

s 5 2 

j=! id. Q_ 
O- tn tf) 


it 


CO 

o> 
CD 
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CO 
T~ 


CO 
CN 
LO 


1236 


LO LO 
CD CO 
V- ^ 


1413 


438 
1278 


LO 
CD 


CD 
CO 


h- 

LO 
CO 


IT- 
CO 

CD 


1023 


CD 
hypothetical membrane protein
cytochrome c biogenesis protein


Matched 
length 
(aa) 


CM 
CO 
N- 






CD 
LO 


CD 


o 




CO 
LO 


CM 


LO 

CM 


LO 

CM 


CD 
CO 


CO 

in 

T 


CM 
r- 
CM 






o 

CM 




o 

CM 


CM 


Similarity
(%) 


YLL 


63.4 




96.0 


89.9 


68.9 




59.9 


65.4 


72.5 


52.0 


66.5 


CO 
CM 

r-- 


72.4 






65.7 




77.1 


58.3 


co 










































Identity 
(%) 


48.3 


40.9 




84.0 


65.1 


37.3 




31.1 


33.9 


41.0 


27.2 


38.8 


45.8 


CM 

5* 






30.9 




57.5 


34.6 


Homologous gene 


Mycobacterium leprae pon1 


Streptomyces coelicolor A3(2) 
whiB 




Streptomyces coelicolor A3(2) 
SCH17.10c 


Mycobacterium tuberculosis 
H37RV Rv3678c 


Escherichia coli K12 shiA




Bacillus subtilis IcfA 


Streptomyces coelicolor A3(2) 
SCJ4.28C 


Bacillus subtilis fabG


Emericella nidulans fluG


Arabidopsis thaliana atg6


Rhizobium leguminosarum nodN 


Mycobacterium tuberculosis 
H37Rv Rv3677c 






Vibrio cholerae crp 




Micrococcus luteus pdg 


Mycobacterium tuberculosis 
H37Rv Rv3673c 


db Match 


prf:2209359A 


pir:S20912 




gp:SCH17_10 


pir:G70790 


sp:SHIA_ECOLI




sp:LCFA_BACSU 


gp:SCJ4_28 


sp:FABG_BACSU 


sp:FLUG_EMENI


prf:2512386A 


sp:NODN_RHILV


pir:F70790 






prf:2323349A




sp:UVEN_MICLU


pir:B70790


n 


2385 


o> 

CO 
CO 


CM 
CD 


CO 
LO 
T— 


CD 
LO 


1353 


CD 
O 

CO 


1536 


LO 
CM 
LO 


CO 
CO 

CD 


CM 
CD 


1194 


I s - 
^- 


CO 
rr 
CO 


1173 


LO 
CD 

I s - 


CO 

CO 


CM 

CD 


o 

CO 

I s - 


CO 

in 

LO 


Terminal 
(nt) 


294004 


297402 


297622 


297783 


298250 


298332 


300695 


299726 


301512 


303099 


304074 


305263 


305758 


306700 


305195 


307504 


306782 


307727 


308734 


309302 


Initial 
(nt) 


296388 


297064 


297431 


297631 


297792 


299684 


300087 


301261 


302036 


302167 


303133 


304070 


305288 


305858 


306367 


306800 


307462 


307918 


307955 


308745 


So * 
CO z 5 


3810 


3811 


CN 

CC 
O" 


3813 


3814 


LC 

CC 
CO 


to 

CC 

co 


CC 
CO 


3818 


3819 


c 

CN 
CC 
C" 


T— 

CN 
CC 


CN 
CN 
CC 

co 


3823 


CN 
CC 
O" 


LO 
1 CN 

cc 
) CO 


CC 
CN 
CC 

c" 


I s - 

CN 

CC 
> CO 


CC 
CN 

CC 

<r 


3829 


LU O 2 

GO ^ G 


r ° 


CO 


CN 

c 


l CO 
) CO 


CO 


IT 
C 


> tc 

) c 


) N 

> c 


CO 

) CO 


o 

CO 


c 

CN 


) T- 
J CN 

T 


■ CN 
J CN 

> <r 


i CO 
I CM 
) CO 


\f 

CN 

<r 


- LT 

i CN 


> <c 

\ CN 

) c 


) I s - 

1 CN 

> c" 


. cc 

1 CN 

> c 


) CD 
\ CM 
) CO 



Z3 



Function 


hypothetical protein 


serine proteinase 


epoxide hydrolase 


hypothetical membrane protein 


phosphoserine phosphatase 


hypothetical protein 


conjugal transfer region protein 




hypothetical membrane protein 


hypothetical protein 


hypothetical protein 








ATP-dependent RNA helicase 


cold shock protein 




DNA topoisomerase I 




Matched 
length 
(a.a) 


CN 
CD 


CO 
CD 
CO 


o 

CO 
CN 


CO 
LO 


r- 
CO 
CN 


CD 
CO 


CD 
CO 




CM 
CO 
CM 


O 
CM 


CD 
LO 








CD 


CO 




CD 




Similarity
(%) 


56.3 


71.0 


52.1 


77.6 


65.5 


60.2 


66.5 




63.7 


64.2 


84.8 








66.1 


88.1 




CD 
CO 




CO 








































Identity 
(%) 


30.7 


38.6 


29.6 


46.8 


29.6 


35.0 


32.9 




LO 

d 

CO 


33.8 


47.5 








33.8 


cd 
CD 




61.7 




Homologous gene 


Escherichia coli K12 yeaB


Mycobacterium tuberculosis 
H37Rv Rv3671c 


Corynebacterium sp. C12 cEH 


Mycobacterium tuberculosis 
H37Rv Rv3669 


Mycobacterium leprae 
MTCY20G9.32C. serB 


Mycobacterium tuberculosis 
H37Rv Rv3660c 


Escherichia coli trbB




Mycobacterium tuberculosis 
H37Rv Rv3658c 


Mycobacterium tuberculosis 
H37Rv Rv3657c 


Mycobacterium tuberculosis 
H37Rv Rv3656c 








Bacillus subtilis yprA 


Arthrobacter globiformis SI55 
csp 




Mycobacterium tuberculosis 
H37Rv Rv3646c topA 




db Match 


! 

o 

o 

LU 

< 

LU 
> 

CO 


pir:H70789 


prf:2411250A 


pir:F70789 


pir:S72914 


pir:E70788 


pir:C44020 




pir:C70788 


— ! 

pir:B70788 


pir:A70788 








sp:YPRA_BACSU


O 
O 
h- 

<, 

0_ 
(D 
O 

CL 

to 




pir:G70563 




St 


CO 

CO 

CD 


1191 


CO 

CD 

CD 


CD 


CO 

CO 

CD 


1023 


1023 


lo 

T— 

CO 


CD 
CO 


CD 
LO 


CO 
CO 
r— 


CO 
CO 


^ 


LO 

CO 


2355 


O 
CN 


LO 
CN 
CM 


2988 


t — 


Terminal 
(nt) 


310038 


311325 


311899 


312909 


313625 


316002 


317132 


316350 


317893 


318465 


318689 


319013 


318545 


319335 


319336 


322207 


321992 


325897 


326614 


Initial 
(nt) 


309370 


310135 


312891 


313457 


314590 


314980 


316110 


316964 


317078 


317920


318492 


318696 


318958 


318991 


321690 


322007 


322216 


322910 


325904 


CO 2 5 


3830 


3831 


3832 


3833 


3834 


3835 


3836 


3837 


3838 


3839 


3840 


r— 

co 

CO 


3842 


3843 


3844 


3845 


3846


3847 


3848 


o n < 

UJ u z 


c 

CO 

'I ^ 


x — 

CO 
CO 


CM 
CO 
CO 


CO 
CO 
CO 


CO 
CO 


ld 

CO 
CO 


tc 

CO 
CO 


r-- 

CO 
CO 


oo 

CO 
CO 


CD 
CO 
CO 


o 

I m 


CO 


CM 
CO 


CO 
CO 


■ft 

CO 


LO 
CO 


CD 
CO 


co 


CO 

co 
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3 



T3 

CD -d „ — . 

15 5^ 



is : 

CO 



0) 

to 

JO 

o 
>. 

o 

0) 

15 
>, 

d 
CD 
T3 
CO 



CO 
CD 
CM 



CN 
CD 



CN 



CD 

>» E 
o £ 

^ (0 

< ^ 
q is 



CO 
CN 



LO 
cn 



CP 

o 

IS 
o 



a. 



CO 

CD 



o 



5 2 

<d * 

CD CD 

11 
o *o 



LO 
CO 



CO 
CO 



T3 CD 
_L T3 
.2 CD 

£ u 

o >• 

if 



CN 
CD 



CD 

CO 



CO 
CD 



CJ 

>* c/) 



CO 

o o 

0 -a 

1 <D 

co 



lo 

CN 



LO 

LO 



LO 
CN 



CD 



o 



00 

o 



"ca 

JC 
Q_ 
W 
O 

.c 

°- CD 

II 

a ±j 
— c: 
o >, 

73 



O 
CO 
CN 



CD 



c 



CO 
O) 

CO 
CD 



Q 



co 

G3 
LO 



CD 

co 



CO 
CO 



; o 
CD 

CN 



CO 
00 
LO 



LO 



CO 
LO 

CM 



CD 



,C0 



o 
c 

CD 



CO 

_o 

O 
E 

o 
X 



X 

C3 



CD 



E 

CO 6 



.E? CO 



CO 



O CN 

o o 

O CN 

£= O 

*55 rr 
Q Q 



CO 

i 

CD 

< 
> 
O 

Q_ 
03 



CO ^ 



So S 



a n < 

LU VJ Z 

w z Q 



LO 

a> 
to 

CD 
CN 



co 



ID 
CO 

o 
< 

CD 

x 1 

CO 
DL 

Q 



CO 
CO 
r — 

o 
o 

LU 
< 



10 



o> 

CO 
LO 
CJ) 
CN 
CO 



CO 
CO 
LO 

CO 
CO 



CO 
CO 
CN 
CO 
CN 
CO 



LO 
CO 



O 
LO 



CO 



CO 
h- 
CD 
O 
CO 
CO 



LO 
CO 



CO 
LO 



O 



CN 



CD 
O 

in 

in 



LU 



D_ 
O 

o 
o 

£ 
< 



O 
o 

LU 

I 

O 
ZD 
—i 
01 



LU 
> 

<. 

X 

a 

< 



CN 
CO 



^3- 
o 



CO 
CO 

Tl- 

CN 
CO 
CO 



CD 
CO 



CN 

LO 
LO 
v- 
CO 
CO 



o 

o 

LO 
CO 
CO 



LO 
CO 



LO 
CO 
CO 



LO 

CO 



LO 
CO 



E ™ 

CO 
•2 CD 

5 > 

o a: 

o 

co 
^ X 



o 
o 

O CN 
O CN 
£= CN 

CO -v- 



CD 
>% 
CN 



CD 

o 



o 
LU 



O 

o 

LU 

m 
< 

LL 



CJ) 

CD 
to 



co 



CO 

co 

CO 



o 

CD 
CO 
CO 



o 

CD 
CO 



LO 

o 

< 



I- 

UJ 



CN 
CN 

o 
>- 



o 

o 

LU 

I 

—3 

LL 
LU 

> 



LO 



LO 



CN 

o 



LO 
CO 

o 



CN 
CO 

o 

CN 



o 



CD 



o 



LO 

r-- 

co 

CN 
CO 



LO 



LO 



co 

LO 



CJ) 
CD 
LO 
O 



1^- 
CN 



co 



CO 

-5- 

CO 



CD 
CO 
CD 
CO 



CO 



CO 
CO 
CO 



CD 
CO 
CO 



LO 
CD 
CO 
CO 



CO 
CD 
CO 



CO 
CD 



^1- 

CO 
CO 



LO 
CD 
CO 



CD 
CD 
CO 



CO 
CO 
CO 



CO 
CO 



CO 
CD 
CO 
CO 



CO 
CD 
CO 
Function




NADP-dependent alcohol 
dehydrogenase 


glucose-1-phosphate
thymidylyltransferase


dTDP-4-keto-L-rhamnose reductase 


dTDP-glucose 4,6-dehydratase 


NADH dehydrogenase 


Fe-regulated protein 




hypothetical membrane protein


metallopeptidase 


prolyl endopeptidase 




hypothetical membrane protein 


cell surface layer protein 


autophosphorylating protein Tyr 
kinase 


protein phosphatase 




capsular polysaccharide 
biosynthesis 


ORF 3 


lipopolysaccharide biosynthesis / 
aminotransferase 


Matched 
length 
(aa) 




CO 
CO 


LO 
CO 

CN 


OJ 
CO 
v- 


CO 

co 


CD 
O 
CN 


LO 
CN 
CO 




CO 
CN 


CD 


CO 

o 
r^- 




CO 

LO 
CN 


CO 
CD 
CO 


CO 

LO 


CN 

o 




CO 
CO 


o 

CO 


CO 
CO 










































Similarity 
(%) 




74.9 


84.9 


74.0 


83.4 


CN 
CD 


LO 
CD 
CD 




68.3 


62.5 


56.4 




46.0 


76.6 


57.2 


68.6 




65.7 


o 
to 


68.3 


Identity 
(%) 




52.2 


62.8 


49.5 


61.8 


35.4 


33.2 




37.4 


34.1 


28.4 




26.0 


50.7 


LO 

CO 

CN 


39.2 




33.0 


41.0 


37.1 


Homologous gene 




Mycobacterium tuberculosis 
H37Rv adhC 


Salmonella anatum M32 rfbA 


Streptococcus mutans rmlC


Streptococcus mutans XC rmlB 


Thermus aquaticus HB8 nox 


Staphylococcus aureus sirA 




Mycobacterium tuberculosis 
H37Rv Rv3630


Streptomyces coelicolor 
SC5F2A.19c 


Sphingomonas capsulata 




Streptomyces coelicolor A3(2) 


Corynebacterium 
ammoniagenes ATCC 6872 


Acinetobacter johnsonii ptk


Acinetobacter johnsonii ptp




Staphylococcus aureus M capD 


Vibrio cholerae 


Campylobacter jejuni wlaK 


db Match 




ZD 
I— 

O 

I 

ZLZ 

Q 
< 

CL 
CO 


sp:RFBA_SALAN 


gp:D78182_5 


sp:RMLB_STRMU


sp:NOX_THETH 


prf:2510361A 




ZD 
(_ 

O 

>■ 

I 

> 

CL 

eft 


gp:SC5F2A_19


prf:2502226A 1 




CO 
LL 

O 
co 
bl 


gsp:W56155 


m 

CD 
CO 

o 

CN 


prf:2404346A 




sp:CAPD_STAAU 


PRF:2109288X 


prf:2423410L 




LO 
CO 


1059 


LO 
LO 

CO 


1359 


1131 


CD 
LO 


LO 


CO 
CO 

CD 


1308 


1380 


2118 


CO 

r-- 

LO 


1092 


1095 


1434 


CO 

o 
CD 


CO 

CO 


1812 


CN 
O) 


1155 


Terminal
(nt) 


346110 


346961 


348098 


348952 


350313 


351370 


353637 


353749 


354599 


355849 


357237 


359762 


360814 


362057 


365257 


365852 


366838 


368643 


367701 


369801 


Initial 
(nt) 


346460 


348019 


348952 


350310 


351443 


351948 


352693 


354387 


355906 


357228 


359354


360334 


361905 


363151 


363824 


365250 


365855 


366832 


368642 


368647 


SEQ
NO.
(a.a.)


3869 


3870 


3871 


3872 


3873 


3874 


3875 


3876 


3877 


3878 


3879 


3880 


3881 


3882 


3883 


3884 


3885 


3886 


3887 


3888 


SEQ
NO.
(DNA)


CO 
CD 
CO 


o 

co 


co 


Ol 
r-- 
co 


CO 

co 


r- 
co 


in 
r-- 
co 


CD 
r-- 
co 


CO 


CO 
CO 


CD 

r- 
co 


o 

CO 
CO 


CO 
CO 


CN 

CO 
CO 


CO 
CO 
CO 


CO 
CO 


LO 
CO 
CO 


CD 

CO 
CO 


CO 
CO 


CO 
CO 
CO 






Function 


dihydrolipoamide dehydrogenase 


UTP--glucose-1-phosphate
uridylyltransferase 


regulatory protein 


transcriptional regulator 


cytochrome b subunit 


succinate dehydrogenase 
flavoprotein 


succinate dehydrogenase subunit B 












hypothetical protein 


hypothetical protein 






tetracenomycin C transcription 
repressor 




transporter 


Matched 
length 
(a.a) 


CO 

CD 
^J- 


in 
CD 

CN 


CO 

in 




o 
CO 
CN 


CO 

O 

CD 


CO 

in 

CM 












CO 

in 

CM 


x — 

CO 






CD 




CO 
CD 


Similarity 
(%) 


100.0 


68.1 


71.9 


81.3 


67.4 


CM 
CD 


56.2 












49.8 


64.3 






53.8 




74.6 


Identity 
(%) 


99.6 


41.7 


43.8 


57.0 


34.8 


32.4 


27.5 












26.3 


32.7 






26.4 




36.1 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 lpd


Xanthomonas campestris 


Pseudomonas aeruginosa PAO1
orfX 


Mycobacterium tuberculosis 
H37Rv Rv0465c 


Streptomyces coelicolor A3(2) 
SCM10.12c 


Bacillus subtilis sdhA 


Paenibacillus macerans sdhB 












Streptomyces coelicolor 
SCC78.05 


Escherichia coli K12 yjiN






Streptomyces glaucescens 
GLA0 tcmR 




Streptomyces fradiae T#2717 
urdJ 


db Match 


gp:CGLPD_1 


pir:JC4985 


CD 
CD 
CD 
O 

< 
Q_ 

o. 


pir:E70828


gp:SCM10_12


pir:A27763 


gp:BMSDHCAB_4 












gp:SCC78_5 


sp:YJIN_ECOLI 






sp:TCMR_STRGA 




CO 

CD 
CD 

CD 

U- 
< 

cL 


Ll_ 


1407 


CN 

CO 


CO 

CD 


1422 




1875 


co 

CO 


co 
CO 
CO 


CD 
CM 


CO 
CD 


CD 
CD 


CD 
CO 
CO 


LO 

CD 


1251 


O 
CM 


co 
O 
CO 


CO 

I s - 

CD 


o 

CN 


1647 


Terminal 
(nt) 


389098 


390168 


390730 


390787 


393475 


395513


396262 


396650 


396932 


396411 


397825 


398222 


397232 


399579 


400017 


400341 


401150 


401253 


402796 


Initial 
(nt) 


387692 


389248 


390233 


392208 


392705 


393639 


395426 


396315 


396672 


397040 


397730 


397884 


398206 


398329 


399598 


400039 


400473 
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hypothetical protein 






phosphoserine phosphatase 


hypothetical protein 


glutamyl-tRNA reductase 
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shikimate transport protein 
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Function 


delta-aminolevulinic acid 
dehydratase 






cation-transporting P-type ATPase B 




uroporphyrinogen decarboxylase 


protoporphyrinogen IX oxidase 


glutamate-1-semialdehyde 2,1- 
aminomutase 


phosphoglycerate mutase 


hypothetical protein 


cytochrome c-type biogenesis 
protein 


hypothetical membrane protein 


cytochrome c biogenesis protein 




transcriptional regulator 


Zn/Co transport repressor 
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glycosyl transferase 


malonyl-CoA-decarboxylase 
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CD 



Function 


succinate-semialdehyde 
dehydrogenase (NAD(P)+) 


novel two-component regulatory 
system 


tyrosine-specific transport protein 


cation-transporting ATPase G 


hypothetical protein or 
dehydrogenase 




50S ribosomal protein L10


50S ribosomal protein L7/L12




hypothetical membrane protein 


DNA-directed RNA polymerase beta 
chain 


DNA-directed RNA polymerase beta 
chain 


hypothetical protein 




DNA-binding protein 


hypothetical protein 


Matched 
length 
(aa) 


r— 
CD 

*^ 


o 

LO 




LO 

T 

CO 


CO 
CD 




o 


o 

CO 




CO 
CO 
CN 


1180 


1332 


CO 

CO 




CN 
CO 
CN 


LO 
CM 


Similarity 
(%) 


CO 
r~ 

r~- 


38.0 


49.9


64.4 


66.2 




CO 


89.2 




55.5 


90.4 


88.7 


52.0 




63.8 


57.7 


Identity 
(%) 


40.8 


32.0 


25.5 


33.2 


40.2 




I 

52.9 


72.3 




25.8 


75.4 


72.9 


39.0 




39.2 


29.3 


Homologous gene 


Escherichia coli K12 gabD 


Azospirillum brasilense carR 


Escherichia coli K12 o341#7 
tyrP 


Mycobacterium tuberculosis 
H37Rv RV1992C ctpG


Streptomyces lividans P49 




Streptomyces griseus N2-3-11 
rplJ 


Mycobacterium tuberculosis 
H37Rv RV0652 rplL 




Mycobacterium tuberculosis 
H37Rv Rv0227c 


Mycobacterium tuberculosis 
H37Rv RV0667 rpoB 


Mycobacterium tuberculosis 
H37Rv RV0668 rpoC 


Mycobacterium tuberculosis 
H37Rv Rv0166c




Streptomyces coelicolor A3(2) 
SCJ9A.15c 


Mycobacterium tuberculosis 
H37Rv RV2908C 


db Match 


sp:GABD_ECOLI


3 

< 
o 

m 
< 

CL 

O 


_i 
O 
o 

LU 

I 

Q_ 
Q- 
>- 

Q_ 

to 


3 
h- 
O 
> 

i 

o 

CL 
I— 

o 

to 


sp:P49_STRLI 




sp:RL10_STRGR


3 
H 
O 
> 

I 

_j 
DC 
cL 
to 




pir:A70962 


sp:RPOB_MYCTU 


sp:RPOC_MYCTU 


GP:AF121004_1 




gp:SCJ9A_15


sp:YT08_MYCTU 


ORF 
(bp) 


1359 


CO 
CD 
tJ- 


1191 


1950 


1413 


CO 
O 
CO 


CO 
LO 


co 

CO 


CO 
CO 


CM 
I s - 

co 


3495 


3999 


CM 
CO 
LO 


O 
CO 


o 

CO 


CO 

CO 

r^- 


Terminal 
(nt) 


504283 


503272 


505569 


507647 


509081 


509696


510510 


510974 


510989 


512507 


516407


520492 


518696 


520850 


521644 


521679 


Initial 
(nt) 


502925 


503739 


504379 


505698 


507669 


509094 


509998 


510591 


511126 


511536 


512913 


516494 


519277 


520671 


520865 


522476 


SEQ
NO.
(a.a.)


4037 


4038 


4039 


4040 


4041


4042 


4043 


4044 


4045 


4046


4047 


4048 


4049 


4050 


4051 


4052 


SEQ 
NO. 
(DNA) 


t^- 
co 

LO 


CO 
CO 
LO 


CO 
CO 
LO 


o 

LO 


LO 


CM 
LO 


CO 
LO 


^j- 

LO 


LO 
LO 


CO 
LO 


r-~ 

rr 
LO 


CO 

^" 

LO 


LO 


o 

LO 
LO 


x — 
LO 
LO 


CM 
LO 
LO 






Function 


30S ribosomal protein S12 


30S ribosomal protein S7 


elongation factor G 






lipoprotein 






ferric enterobactin transport ATP- 
binding protein 


ferric enterobactin transport protein 


ferric enterobactin transport protein 


butyryl-CoA:acetate coenzyme A 
transferase 


30S ribosomal protein S10 


50S ribosomal protein L3




50S ribosomal protein L4


50S ribosomal protein L23




50S ribosomal protein L2


30S ribosomal protein S19 




Matched 
length 
(a.a) 


CN 


LO 


CO 

o 
r^- 












CO 
lO 
CN 


CD 
CN 
CO 


LO 
CO 
CO 


LO 


o 


CM 
CN 




CM 
CN 


CO 
CD 




o 

CO 
CN 


CM 
CD 




Similarity 
(%) 


97.5 


94.8 


88.9 






78.0 






83.7 


CO 
I< 


80.6 


79.3 


99.0 


89.6 




90.1 


90.6 




92.9 


98.9 




Identity 
(%) 


90.9 


CO 
\— 
CO 


71.7 






o 

CO 
LO 






56.2 


45.6 


48.1 


56.6 


84.2 


66.5 




71.2 


74.0 




80.7 


87.0 




Homologous gene 


Mycobacterium intracellulare 
rpsL 


Mycobacterium smegmatis 
LR222 rpsG 


Micrococcus luteus fusA 






Chlamydia trachomatis 






Escherichia coli K12 fepC 


Escherichia coli K12 fepG 


Escherichia coli K12 fepD


Thermoanaerobacterium 
thermosaccharolyticum actA 


Planobispora rosea ATCC 
53733 rpsJ 


Mycobacterium bovis BCG rplC 




Mycobacterium bovis BCG rplD 


Mycobacterium bovis BCG rplW 




Mycobacterium bovis BCG rplB 


Mycobacterium tuberculosis 
H37Rv Rv0705 rpsS 




db Match 


t- 

o 
> 

cm 1 

T- 

co 

cL 
tn 


CO 

o 

>- 

1 

h- 

GO 

ct 

CO 


sp:EFG_MICLU 






GSP:Y37841 






_j 
o 
o 

LU 

I 

O 
CL 
LU 
Ll_ 

CL 

tn 


sp:FEPG_ECOLI


sp:FEPD_ECOLI 


gprCTACTAGENJ 


sp:RS10_PLARO 


sp:RL3_MYCBO 




o 

CD 
O 
>- 

I 

_i 
cr 

CL 

tn 


sp:RL23_MYCBO 




LU 

_j 
o 

>• 

CN 1 
_l 
QL 

CL 
cn 


sp:RS19_MYCTU 




ORF 
(bp) 


CD 
CD 
CO 


LO 

CD 


2115 


2160 




CO 

CN 

CM 


CO 
LO 


CD 
CN 
h- 


CM 
CO 


1035 


1035 


CD 
LO 


CO 

o 

CO 


LO 
CD 


CO 

CO 


XT 
LO 
CD 


CO 

o 

CO 


r~- 
CM 
CO 


O 
CO 


CD 
CN 


LO 
CO 

CN 


Terminal 
(nt) 


523059 


523533 


526010 


523911 


526013 


526894 


527607 


528768 


528779 


529592 


530748 


532523 


533401 


534090 


533401 


534743 


535048 


534746 


535915 


536210 


535899 


Initial 
(nt) 


522694 


523069 


523896 


526070 


526156 


527121 


527759 


528040 


529570 


530626 


531782 


532008 


533099 


533437 


534087 


534090 


534746 


535072 


535076 


535935 


536183 


SEQ 
NO. 
(a.a.) 


4053 


4054 


4055 


4056 


4057 


4058 


4059 


4060 


4061 


4062 


4063 


4064 


4065 


4066 


4067 


4068 


4069 


4070 


4071 


4072 


4073 


SEQ 
NO 
(DNA) 


CO 
LO 
LO 


LO 

in 


LO 
LO 
LO 


CD 
LO 
LO 


LO 

in 


CO 
LO 

to 


CD 
LO 

in 


o 

CO 

in 


CD 
LO 


CM 
CO 
LO 


CO 
CD 

in 


CO 

in 


LO 
CD 
LO 


CD 
CD 

in 


r- 

CD 

in 


CO 
CD 
LO 


CD 
CD 
LO 


o 
r- 
m 


■v- 
LO 


CM 
h- 
LO 


CO 
LO 






Function 


50S ribosomal protein L22


30S ribosomal protein S3 


50S ribosomal protein L16


50S ribosomal protein L29


30S ribosomal protein S17 








50S ribosomal protein L14


50S ribosomal protein L24 


50S ribosomal protein L5




2,5-diketo-D-gluconic acid reductase 




formate dehydrogenase chain D 


molybdopterin-guanine dinucleotide 
biosynthesis protein 


formate dehydrogenase H or alpha 
chain 






ABC transporter ATP-binding protein 






Matched 
length 
(aa) 


o 
o 


CD 
CO 
CN 


co 


CO 


CN 
CO 








CN 
CN 


LO 
O 


CO 
CO 




o 

CD 
CN 




co 

CD 
CN 




CD 
LO 






CN 
CO 






Similarity 
(%) 


91.7 


91.2 


88.3 


88.1 


89.0 








95.1 


91.4 


92.3 




74.2 




59.7 


68.1 


53.4 






52.6 






Identity 
(%) 


CO 


YLL 


69.3 


LO 
CO 


69.5 








83.6 


76.2 


73.6 




52.3 




28.9 


37.2 


24.3 






26.9 






Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0706 rplV


Mycobacterium bovis BCG rpsC 


Mycobacterium bovis BCG rplP 


Mycobacterium bovis BCG rpmC 


Mycobacterium bovis BCG rpsQ 








Mycobacterium tuberculosis 
H37Rv Rv0714 rplN


Mycobacterium tuberculosis 
H37Rv Rv0715 rplX 


Micrococcus luteus rplE




Corynebacterium sp. 




Wolinella succinogenes fdhD 


Streptomyces coelicolor A3(2) 
SCGD3.29C 


Escherichia coli fdhF






Mycobacterium tuberculosis 
H37Rv Rv1281c oppD






db Match 


sp:RL22_MYCTU 


sp:RS3_MYCBO


O 

m 
o 
> 

co' 

Ij 
QC 
cL 

V) 


sp:RL29_MYCBO 


sp:RS17_MYCBO








sp:RL14_MYCTU 


O 
> 

CN 

1 

LY 

Q_ 
CO 


ID 
i 

o 

LO 1 

_] 

a: 

bl 
cn 




sp:2DKG_CORSP 




sp:FDHD_WOLSU


gp:SCGD3_29


sp:FDHF_ECOLI






ZD 
f- 
O 
> 

I 

CO 

O 
> 

C/) 






g£ 


o 
to 

CO 




J 


00 
CN 
CN 


CD 
r-- 

CN 


CO 
CN 


CO 
CO 


CO 
CD 
CO 


CO 
CO 
CO 


CN 

r — 
CO 


CO 
to 


1032 


r-- 
o 

CO 


CN 
O 


LO 
CD 


CO 
CO 
CO 


2133 


CO 
LO 


o 

CO 


1662 


1146 


1074 


Terminal 
(nt) 


536576 


537322 


537741 


537971 


538252 


537974 


538381 


538718 


540106 


540423 


540998 


542079 


542090 


542921 


543415 


544335 


544757 


548084 


548187 


548990 


550699 


551854 


Initial 
(nt) 


536217 


536579 


537328 


537744 


537977 


538267 


538698 


539413 


539741 


540112 


540426 


541048 


542896 


543412 


544329 


544670 


546889 


547329 


548990 


550651 


551844 


552927 


SEQ 
NO. 
(a.a.)


4074 


4075 


4076 


4077 


4078 


4079 


4080 


4081 


4082 


4083 


4084 


4085 


4086 


4087 


4088 


4089 


4090 


4091 


4092 


4093 


4094 


4095 


SEQ 
NO. 
(DNA) 


T 

LO 


LO 

h- 

LO 


CO 

LO 


LO 


CO 
LO 


CD 
LO 


o 

CO 

to 


CO 
LO 


CN 
CO 

LO 


CO 
CO 

LO 


CO 
LO 


to 

00 

LO 


CD 
CO 
to 


r-- 
co 

LO 


CO 

co 
to 


CD 
CO 

to 


o 

CD 

LO 


5) 

to 


CN 
CD 

LO 


CO 
CD 

LO 


CD 

LO 


LO 
CD 
to 






Function 


hypothetical protein 


hypothetical protein 


30S ribosomal protein S8 


50S ribosomal protein L6


50S ribosomal protein L18


30S ribosomal protein S5 


50S ribosomal protein L30 


50S ribosomal protein L15




methylmalonic acid semialdehyde
dehydrogenase 




novel two-component regulatory 
system 


aldehyde dehydrogenase or betaine 
aldehyde dehydrogenase 






reductase 


2Fe2S ferredoxin 


p-cumic alcohol dehydrogenase 


hypothetical protein 


phosphoenolpyruvate synthetase 


phosphoenolpyruvate synthetase 


cytochrome P450 


Matched 
length 
(aa) 


in 
o 


o 

LO 


CM 
CO 


CO 

r-~ 


o 




in 

LO 


CO 




CO 
CM 

T— 




in 

CM 

T 


CO 






CO 

o 


i>- 

o 


LO 
CM 


o 
in 


CO 
CM 
CD 


CO 
CO 


CM 

CM 
XT 


Similarity 
(%) 


50.4 


66.7 


CO 


87.7 


90.9 


88.3 


76.4 


87.4 




68.8 




52.0 


71.5 






71.6 


66.4 


70.8 


56.0 


45.0 


66.7 


65.2 


Identity 

(%) 


24.7 


42.7 


75.8 


59.2 


67.3 


67.8 


54.6 


66.4 




46.9 




47.0 


41.7 






41.1 


47.7 


35.8 


50.0 


22.9 


38.6 


34.8 


Homologous gene 


Archaeoglobus fulgidus AF1398


Deinococcus radiodurans 
DR0763 


Micrococcus luteus 


Micrococcus luteus 


Micrococcus luteus rplR


Micrococcus luteus rpsE 


Escherichia coli K12 rpmJ 


Micrococcus luteus rplO 




Streptomyces coelicolor msdA 




Azospirillum brasilense carR 


Rhodococcus rhodochrous 
plasmid pRTL1 orfS 






Sphingomonas sp. redA2 


Rhodobacter capsulatus fdxE 


Pseudomonas putida cymB 


Aeropyrum pernix K1 APE0029 


Pyrococcus furiosus Vc1 DSM 
3638 ppsA 


Pyrococcus furiosus Vc1 DSM 
3638 ppsA 


Rhodococcus erythropolis thcB 


db Match 


pir:E69424 


gp:AE001931_13 


pir:S29885 


pir:S29886 


D 

i 

O 

I 

CO 

Ij 
bl 

t/i 


sp:RS5_MICLU 


_i 
o 
o 

LU 

o 1 

CO 

_J 
a: 

Cl 
cn 


sp:RL15_MICLU 




prf:2204281A




GP:ABCARRA_2 


prf:2516398E






prf:2411257B 


prf:2313248B


gp:PPU24215_2


PIR:H72754 


pir:JC4176 


pir:JC4176 


prf:2104333G


ORF 
(bp) 


1182 


CO 
CO 

xr 


CO 

cn 

CO 


co 

LO 


CM 

CD 


CO 
CO 

CD 


CO 
CO 




CO 
CM 


CM 
CO 


CO 
CD 
CO 


CO 

LO 


1491 


LO 
CO 


CD 
o 

CO 


1266 


CO 
CO 




CO 
CM 


1740 


1080 


1290 


Terminal 
(nt) 


CO 

o> 

in 
LO 


554452 


555726 


556282 


556690 


557366 


557555 


558008 


556860 


558197 


558607 


560260 


559144 


560634 


562937 


561368 


562646 


562993 


564083 


563732 


565680 


566799 


Initial 
(nt) 


554129 


554919 


555331 


555749 


556289 


556734 


557373 


557565 


557588 


558517 


558969 


559805 


560634 


561368 


562632 


562633 


562963 


563736 


563871 


565471 


566759 


568088 


SEQ 
NO. 
(a.a.) 


4096 


4097 


4098 


4099 


4100 


4101 


4102 


4103 


4104 


4105 


4106 


4107 


4108 



4109 


4110 


4111 


4112 


4113 


4114 


4115 


4116 


4117 


SEQ 
NO. 
(DNA) 


CD 
CD 

m 


CD 
LO 


CO 

cn 

LO 


O) 

CO 

LO 


o 
o 

CO 


o 

CO 


CM 
O 
CD 


CO 

o 

CO 


^ 

o 

CO 


LO 
O 
CO 


CD 
O 
CD 


r>- 
o 

CO 


CO 

o 

CO 


CO 

o 

CO 


o 

CO 


CD 


CM 
\ — 
CD 


CO 
CD 


T — 

CD 


m 

CD 


CD 
CD 


CD 






Function 


high-alkaline serine proteinase 


hypothetical membrane protein 


hypothetical membrane protein 








hypothetical protein 


early secretory antigen target ESAT- 
6 protein 


50S ribosomal protein L13 


30S ribosomal protein S9 


phosphoglucosamine mutase 




hypothetical protein 






hypothetical protein 


alanine racemase 


hypothetical protein 


■o 






































Matched
length 


CO 
CN 


CD 

LO 


1260 








CO 

o 


o 

CO 


lO 


CO 


o 

LO 




CO 
CO 






CD 
LO 
CN 


CO 

CD 

CO 


LO 


Similarity
(%) 


58.0 


50.6 


38.4 








69.9 


81.3 


82.1 


72.4 


76.4 




45.6 






72.2 


68.5 


78.6 


CO 






































Identity 
(%) 


31.3 


24.0 


65.0 








31.1 


36.3 


58.6 


49.2 


CD 
CO 




29.3 






o 


41.6 


48.7 


Homologous gene 


Bacillus alcalophilus 


Streptomyces coelicolor A3(2) 
SC3C3.21 


Mycobacterium tuberculosis 
H37Rv Rv3447c 








Mycobacterium tuberculosis 
H37Rv Rv3445c 


Mycobacterium tuberculosis 


Streptomyces coelicolor A3(2) 
SC6G4.12. rplM


Streptomyces coelicolor A3(2) 
SC6G4.13. rpsI


Staphylococcus aureus 
femR315 




Synechocystis sp. PCC6803 
slr1753 






Mycobacterium leprae 
B229_F1_20


Mycobacterium tuberculosis 
H37Rv RV3423C alr


Mycobacterium tuberculosis 
H37Rv Rv3422c 


db Match 


o 
< 
o 
< 

CO 

<■ 

>- 
_J 
LU 
cL 
tfi 


pir:T10930 


pir:E70977 








pir:C70977 


prf:2111376A 


o 
o 

H 

w , 

CO 

Zj 
LT 
o_ 
H) 


O 
O 
rr 

H- 

co 

I 

CD 

co 

Q_ 
W 


prf:2320260A 




pir:S75138 






pir:S73000 


sp:ALR_MYCTU 


sp:Y097_MYCTU 


ORF 
(bp) 


1359 


1371 


3567 


CN 
CN 
CO 


CO 
CD 
CD 


O 
O 
CD 


CN 

CO 


CO 

co 

CN 




CD 

m 


1341 


CO 

o 

CO 


1509 


CO 
LO 


CO 
CN 


LO 

LO 
CO 


1083 


LO 

CD 


Terminal 
(nt) 


586399 


587645 


592862 


589590 


589898 


593761 


594258 


594580 


595379 


595927 


597449 


598194 


599702 


598778 


599932 


600022 


602053 


602574 


Initial 
(nt) 


587757 


589015 


589296 


590411 


590560 


592862 


593935 


594293 


594939 


595382 


596109 


597892 


598194 


599350 


599699 


600876 


600971 


602080 


SEQ
NO.
(a.a.)


4138 


4139 


4140 


4141 


4142 


4143 


4144 


4145 


4146 


4147 


4148 


4149 


4150 


4151 


4152 


4153 


4154 


4155 


SEQ
NO.
(DNA)


CO 
CO 
CD 


CD 
CO 
CD 


o 

CD 


5- 

CD 


CN 
CD 


CO 
CD 


CD 


LO 
CD 


CD 
CD 


CD 


CO 
CD 


CD 
CD 


o 

LO 
CO 


to 

CO 


CN 
LO 
CD 


CO 
LO 
CD 


LO 
CO 


LO 
LO 
CO 
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3 



0) 



Function 


hypothetical membrane protein 


proline iminopeptidase 


hypothetical protein 


ribosomal-protein-alanine N- 
acetyltransferase 


O-sialoglycoprotein endopeptidase 


hypothetical protein 






heat shock protein groES 


heat shock protein groEL 


hypothetical protein 


hypothetical protein 


regulatory protein 


RNA polymerase sigma factor 




hypothetical protein 


IMP dehydrogenase 


hypothetical protein 


Matched 
length 
(aa) 


o 

ID 

in 




O 
CN 


CN 
CO 


CO 


in 






o 
o 


CO 
LO 


to 
r*- 


CO 

CO 


CD 


x — 




CO 

T— 
T — 


o 
in 


CD 


Similarity 
(%) 


66.2 


77.6 


75.4 


59.9 


75.2 


59.4 






94.0 


85.1 


56.0 


45.0 


88.3 


81.6 




69.8 


93.9 


53.0 


Identity 
(%) 


28.9 


51.3 


52.2 


30.3 


46.1 


38.4 






76.0 


63.3 


50.0 


34.0 


CD 
CD 


55.2 




41.4 


80.8 


39.0 


Homologous gene 


Escherichia coli K12 yidE 


Propionibacterium shermanii pip


Mycobacterium tuberculosis 
H37Rv Rv3421c 


Escherichia coli K12 rimI


Pasteurella haemolytica 
SEROTYPE A1 gcp 


Mycobacterium tuberculosis 
H37Rv Rv3433c 






Mycobacterium tuberculosis 
H37Rv RV3418C mopB 


Mycobacterium leprae 
B229_C3_248 groE1 


Mycobacterium tuberculosis 


Mycobacterium tuberculosis 


Mycobacterium smegmatis 
whiB3 


Mycobacterium tuberculosis 
H37Rv Rv3414c sigD 




Mycobacterium leprae 
B1620_F3_131 


Corynebacterium 
ammoniagenes ATCC 6872 
guaB 


Pyrococcus horikoshii PH0308 


db Match 


_i 
o 
o 

LU 
J 

Q 

Q_ 

to 


T — 

1 

CO 

o 
o 
—> 
CO 
o_ 
h_ 
cn 


3 
1— 
O 
> 

co 1 

O) 

O 
^ 

CL 
CO 


sp:RIMI_ECOLI 


sp:GCP_PASHA 


3 
f- 
O 
> 

LO' 

>- 

CL 
CO 






I— 
O 
>■ 

o 1 
X 

o 

CL 
CO 


LU 

_J 

o 
> 

I 

CO 

X 

o 

CL 
CO 


GP:MSGTCWPAJ 


GP:MSGTCWPA_3 


gp:AF073300J 


3 
I- 
O 
> 

LL 1 
cx> 
o 

ol. 
Ol 




LU 
— I 

o 
> 

I 

X 

CD 

o 

> 

CL 
CO 


gp:AB003154J 


PIR:F71456 


ORF 
(bp) 


1599 


1239 


LO 

CD 


O 
LO 


1032 


1722 


o> 

CN 


CO 

LO 


N- 
o> 
CN 


1614 


in 
in 
CN 


1158 


CO 

CN 


xr 

CO 

m 


1026 


CO 

co 


1518 


I s -* 

CM 
CD 


Terminal 
(nt) 


604409 


605708 


606392 


606898 


607936 


609679 


610175 


609816 


610644 


612272 


610946 


611109 


612418 


613719 


614747 


614803 


616853 


615605 


Initial
(nt) 


602811 


604470 


605718 


606392 


606905 


607958 


609747 


610268 


610348 


610659 


611200 


612266 


612714 


613156 


613722 


615180 


615336 


616231 


SEQ
NO.
(a.a.)


4156 


4157 


4158 


4159 


4160 


4161 


4162 


4163


4164 


4165 


4166 


4167 


4168 


4169 


4170 


4171 


4172 


4173 


SEQ 
NO. 
(DNA) 


CO 

LO 

CD 


I s - 

LO 

CO 


CO 

Ln 

CD 


CD 

LO 

CD 


o 

CD 
CO 


CD 
CD 


CM 
CO 
CD 


CO 
CD 
CD 


«r 

CD 
CD 


in 

CD 
CD 


CO 
CD 
CO 


CO 
CO 


CO 
CO 
CD 


CD 
CD 
CD 


o 

CO 


CD 


CM 

co 


CO 

to 
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Function 


IMP dehydrogenase 


hypothetical membrane protein 


glutamate synthetase positive 
regulator 


GMP synthetase 








hypothetical membrane protein 


two-component system sensor 
histidine kinase 


transcriptional regulator or 
extracellular proteinase response 
regulator 








hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical membrane protein 




Matched 
length 
(aa) 


CO 
CO 


CN 


CN 
CD 
CN 


in 








CO 

in 




CO 

CM 








o 

CM 


CO 
CD 

in 




LO 

CM 


CO 
CO 

CM 




Similarity 
(%) 


86.1 


67.5


58.4 


92.8 








39.6 


48.7 


65.1 








64.2 


CD 




62.9 


58.3 




Identity 
(%) 


70.9 


38.0 


29.0


CD 
CO 








20.5 


26.8 


33.5 








30.9 


37.5 




33.8 


27.8 




Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6872 


Escherichia coli K12 ybiF


Bacillus subtilis gltC 


Corynebacterium 
ammoniagenes guaA 








Streptomyces coelicolor A3(2) 


Streptomyces coelicolor A3(2) 
SC6E10.15C 


Bacillus subtilis 168 degU 








Mycobacterium tuberculosis 
H37Rv Rv3395c 


Mycobacterium tuberculosis 
H37Rv Rv3394c 




Streptomyces coelicolor A3(2) 
SC5B8.20C 


Deinococcus radiodurans 
DR0809 




db Match 


gp:AB003154_2 


_j 
O 
o 

LU 
u. 1 
m 
>- 
cL 

CO 


prf:1516239A 


< 

O 
o 

i 

3 

a 

CO 








CM 

CO 
CD 
Q 
O 
CO 

cL 
co 


LO 

o' 

LU 
CO 

O 

CO 

CL 
CD 


3 
CO 

o 
< 

CO 

! 

3 
CD 
LU 
Q 

CL 
CO 








pir:B70975 


pir:A70975 




gp:SC5B8_20 


gp:AE001935_7 




ORF 
(bp) 


1122 


CN 
CO 


cr> 
O 

CO 


1569 


CO 
CD 
CD 


5- 


CO 
CO 


1176 


1140 


o 

CO 
CO 


CM 
CO 


CO 
CO 


CO 
CO 
CO 


LO 
CM 
CO 


1590 


o 

CO 
CD 


CD 
CO 


CD 
CO 


o 

CO 
CO 


Terminal 
(nt) 


618094 


618093 


619994 


621572 


620264 


622157 


622457 


622460 


624939 


625674 


626000 


626070 


626577 


628551 


630140 


630151 


631809 


631824 


632690 


Initial 
(nt) 


616973 


619013 


619086 


620004 


620926 


621717 


622269 


623635 


623800 


624985 


625677 


626558 


627539 


627727 


628551 


630810 


630949 


632684 


633079 


SEQ
NO.
(a.a.)


4174 


4175 


4176 


4177 


4178 


4179 


4180 


4181 


4182 


4183 


4184 


4185 


4186 


4187 


4188 


4189 


4190 


4191 


4192 


SEQ
NO.
(DNA)


CD 


LO 

r-- 

CD 


CO 
CD 


CD 


CO 

h- 

CD 


CO 
CD 


o 

CO 

CD 


CO 

CD 


CM 
CO 
CD 


CO 

CO 

CD 


CO 

CD 


in 

CO 

CO 


CO 
CO 
CD 


CO 
CD 


CO 
CO 
CO 


CO 
CO 
CO 


o 

CO 
CO 


CO 
CO 


CM 
CO 
CO 
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o 
o 



0 



Function 


hypothetical membrane protein 


phytoene desaturase 


phytoene synthase 


transmembrane transport protein 


geranylgeranyl pyrophosphate 
(GGPP) synthase 


transcriptional regulator (MarR 
family) 



outer membrane lipoprotein 


hypothetical protein 


DNA photolyase 


glycosyl transferase 


ABC transporter 


ABC transporter 




ABC transporter 




ABC transporter 


lipoprotein


DNA polymerase III


hypothetical protein 


Matched 
length 
(aa) 


in 
a> 


CN 
LO 


CO 
CO 
CN 


CN 
CN 
r-- 


r-- 
co 

CO 


CO 
CO 


m 


CN 
CO 


h- 
*^ 


LO 

o 

CN 


CD 

cO 


CO 
CNI 
CN 




CO 

o 

CN 




CD 

co 


CO 

CO 

CN 


1101 


CD 
LO 


Similarity
(%) 


67.4 


76.2 


71.2 


75.6 


63.8 


68.1 


62.1 


74.2 


63.2 


53.7 


54.9 


72.2 




75.2




75.4 


67.2 


57.5 


62.3 


CO 








































Identity 
(%) 


36.8 


d 

LO 


42.0 


48.6 


32.7 


38.3 


33.1 


48.7


40.0 


25.9 


24.3 


35.4 




35.9 




43.6 


CO 
CN 


30.2 


41.5 


Homologous gene 


Mycobacterium marinum 


Brevibacterium linens ATCC 
9175 crtI


Brevibacterium linens ATCC 
9175 crtB 


Streptomyces coelicolor A3(2)
SCF43A.29C 


Brevibacterium linens crtE 


Brevibacterium linens 


Citrobacter freundii blc OS60 blc


Brevibacterium linens 


Brevibacterium linens ATCC 
9175 cpdl 


Streptococcus suis cpslK 


Streptomyces coelicolor A3(2)
SCE25.30 


Bacillus subtilis 168 yvrO




Helicobacter pylori abcD 




Escherichia coli TAP90 abc 


Haemophilus influenzae 
SEROTYPE B hlpA 


Thermus aquaticus dnaE 


Streptomyces coelicolor A3(2)
SCE126.11 


db Match 


gp:MMU92075_3


gp:AF139916_3 


gp:AF139916_2


CD 
CO 

"3- 

LL 
O 
CO 
bl 
cn 


CD 
CD 
CO 

T— 

LL. 
< 

CD 


gp:AF139916_14


sp:BLC_CITFR


5 

CD 
CO 
CO 

UL 
< 

CL 

CD 


LO 

O) 
CD 
CO 

T- 

Lt_ 
< 

U> 


gp:AF155804_7 


gp:SCE25_30 


prf:2420410P




prf:2320284D 




sp:ABC_ECOLI 


~z. 

m 
< 

X 

<• 

Ol 
_] 
X 
o_ 

tf) 


prf:2517386A 


gp:SCE126_11


ORF 
(bp) 


CO 
CD 
CQ 


1644 


CN 
^ — 

CO 


2190 


1146 


LO 
CO 

LO 


CO 
CO 


1425 


1404 


CO 

in 


2415 


r-- 
r-- 


CO 
LO 


CO 
CD 
CO 


co 

CO 


1080 


CD 

CO 


3012 




Terminal 
(nt) 


633079 


633532 


635178 


636089 


638317 


640208 


640232 


642557 


642556 


644778 


645176 


647593 


648315 


648440 


650187 


649114 


650392 


654612 


655122 


Initial
(nt) 


633474 


635175 


636089 


638278 


639462 


639624 


640879 


641133


643959 


644026 


647590 


648309 


648467 


649105 


649342 


650193 


651288 


651601 


654676 


SEQ
NO.
(a.a.)


- CO 

G) 


4194 


4195 


4196 


4197 


4198 


cn 
CD 


o 
o 

CN 


4201 


CN 
O 
CN 


4203 


a 

CN 


LO 

o 

CN 


CO 

a 

CN 


r- 

a 

CN 


CO 

o 

CN 


4209 


o 
CN 


4211 





Function 


hypothetical membrane protein 




transcriptional repressor 


hypothetical protein 




transcriptional regulator (birA family)


hypothetical protein 


iron-regulated lipoprotein precursor 


rRNA methylase 


methylenetetrahydrofolate 
dehydrogenase 


hypothetical membrane protein 


hypothetical protein 




homoserine O-acetyltransferase 


O-acetylhomoserine sulfhydrylase


carbon starvation protein 




hypothetical protein 




Matched 
length 
(aa) 


CO 
CO 




CO 

o 

CN 


CO 

CM 




LO 
CN 


LO 


LO 
CO 


in 


CO 

r — 

CM 


o 

CO 


CO 
CO 




co 


CO 
CM 


o 

CD 
CD 




o 

LO 




Similarity
(%) 


56.0 




76.4 


61.7 




71.8 


78.3 


62.2 


86.1 


87.4 


76.3 


63.2 




99.5 


76.2 


78.4 




66.0 




CO 








































Identity 
(%) 


26.1 




50.3 


34.9 




42.5 


45.2 


31.1 


62.9 


70.9 


31.3 


34.0 




in 

CD 
CD 


49.7 


53.9 




40.0 




Homologous gene 


Streptomyces coelicolor A3(2) 
SCE9.01 




Mycobacterium tuberculosis 
H37Rv Rv2788 sirR 


Streptomyces coelicolor A3(2) 
SCG8A.05C 




Archaeoglobus fulgidus AF1676


Streptomyces coelicolor A3(2) 
SC5H1.34 


Corynebacterium diphtheriae 
irp1 


Mycobacterium tuberculosis 
H37Rv Rv3366 spoU


Mycobacterium tuberculosis 
H37Rv Rv3356c folD 


Mycobacterium leprae 
MLCB1779.16c


Streptomyces coelicolor A3(2) 
SC66T3.18C 




Corynebacterium glutamicum
metA 


Leptospira meyeri metY 


Escherichia coli K12 cstA




Escherichia coli K12 yjiX




db Match 


gp:SCE9_1 




pir:C70884 


<' 

CO 

O 
O 
CO 

CL 
O) 




pir:C69459 


co 

! 

X 
m 
O 
CO 
cL 
o> 


gp:CDU02617_1 


pir:E70971 


pir:C70970 


CO 

J 

5 

o 

_l 

O) 


gp:SC66T3_18




gp:AF052652_1 


prf:2317335A


sp:CSTA_ECOLI




sp:YJIX_ECOLI




si 


1413 


CO 

CO 


O) 
CO 

CD 


CO 

CO 


CO 
ro 




CM 

CO 


CD 

CD 
CD 


r- 
^- 


CM 

LO 
CO 


LO 

in 

CM 


1380 


co 

CO 
CD 


1131 


1311 


2202 


CO 

o 

CO 


o 

CM 


o 

CO 


Terminal 
(nt) 


656534 


655097 


657215 


657205 


658142 


658928 


659424 


660538 


660650 


662017 


662374 


662382 


664126 


665183 


666460 


670465 


669445 


670672 


671045 


Initial 
(nt) 


655122 


655834 


656547 


658002 


658005 


658155 


658933 


659543 


661120


661166 


662120 


663761 


665088 


666313 


667770 


668264 


CO 

in 

CD 
C 
r- 
cc 


CN 

I s - 

c 
I s - 
cc 


671653 


SEQ
NO.
(a.a.)


4212 


C*" 


4214 


4215 


CC 

T— 

CN 


r- 

l CN 

■ "3 


4218 


4219 


4220 


4221 


4222 


4223 


CN 
CN 


4225 


CC 
CN 
CN 


h- 
CN 
CN 


CC 
CN 
CN 


cr 
1 CN 
1 CN 
■ "3 


4230 


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Z3 



CO 

_Q 



Function 


hypothetical protein 


carboxy phosphoenolpyruvate 
mutase 


citrate synthase 




hypothetical protein 




L-malate dehydrogenase 


regulatory protein 




vibriobactin utilization protein 


ABC transporter ATP-binding protein 


ABC transporter 


ABC transporter 


iron-regulated lipoprotein precursor 


chloramphenicol resistance protein 


catabolite repression control protein 


hypothetical protein 





Matched 
length 
(aa) 


I s - 

co 


x — 
CO 
CN 


o 

CO 
CO 




CO 
LO 




CO 
CO 
CO 


CD 
CN 
CNJ 




co 

CN 


CD 
CO 
CN 


CO 
CO 
CO 


o 

CO 
CO 


CD 
LO 
CO 


LO 
OO 
CO 


CO 

o 

CO 


CN 




Similarity
(%) 


86.4 


76.2 


81.3 




62.3 




67.5 


62.8 




54.2 


85.1 


86.4 


88.2 


82.3 


69.6 


58.1 


85.8 




CO 






































Identity 
(%) 


71.0 


CO 


56.1 




34.0 




37.6 


T 

CD 
CN 




25.4 


55.4 


56.3 


63.0 


53.1 


32.2 


30.4 


56.2 





Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv1130 


Streptomyces hygroscopicus 


Mycobacterium smegmatis 
ATCC 607 gltA 




Escherichia coli K12 yneC




Methanothermus fervidus V24S 
mdh 


Bacillus stearothermophilus T-6 
uxuR




Vibrio cholerae OGAWA 395 
viuB 


Corynebacterium diphtheriae 
irp1D


Corynebacterium diphtheriae 
irp1C


Corynebacterium diphtheriae 
irp1B


Corynebacterium diphtheriae 
irp1 


Streptomyces venezuelae cmlv


Pseudomonas aeruginosa crc 


Haemophilus influenzae Rd 
HI1240 




db Match 


pir:C70539 


prf:1902224A 


CO 

O 
> 

CO 

o 

cL 

CD 




sp:YNEC_ECOLI




LU 
u_ 

LU 

1 

X 
Q 

bl 

CO 


prf:2514353L




X 

o 

CD 

>. 

CD 

D 

> 

bl 
(ft 


~i 

CNJ 

o 

CD 
CD 

r*- 

T — 

LL 
< 

O) 


gp:AF176902_2 


gp:AF176902_1


gp:CDU02617_1


prf:2202262A 


prf:2222220B 


sp:YlCG_HAEIN 




Li_ -r- 

St 


LO 
CD 


CN 
CD 


1149 


o 

CO 
CD 


CN 
CD 


CN 

I s - 

CO 


1041 


o 
CN 

I s - 


CN 
O 

I s - 


I s - 

CD 
CO 


I s - 

o 

CO 


1059 


CD 
CO 
CO 


1050 


1272 


CN 
CO 


I s - 

LO 
CD 


LO 
CO 


Terminal 
(nt) 


672653 


673576 


674756 


672710 


674799 


675846 


675082 


676218 


677047 


680131 


681040 


681846 


682871 


683876 


686380 


687346 


688007 


688335 


Initial 
(nt) 


671700 


672665 


673608 


673639 


674990 


675175 


676122 


676937 


677748 


681027 


681846 


682904 


683866 


684925 


685109 


686435 


687351 


688141 


SEQ
NO.
(a.a.)


4231 


4232 


4233 


<r 

CN 


IT 
CD 
CN 
«3 


CC 

c 

CN 
Tt 


4237 


4238 


O] 
CO 

CN 

Ti- 


4240 


4241 


4242 


4243 


4244 


LC 
Tf 
CN 
TJ 


CC 

Tf 

Cv 
TJ 


4247 


4248 





Function 


hypothetical protein 


thioredoxin reductase 


PrpD protein for propionate 
catabolism 


carboxy phosphoenolpyruvate 
mutase 


hypothetical protein 


citrate synthase 




hypothetical protein 






thiosulfate sulfurtransferase 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


hypothetical protein 


detergent sensitivity rescuer or 
carboxyl transferase 


detergent sensitivity rescuer or 
carboxyl transferase 


Matched 
length 
(aa) 


CO 

CO 


LO 
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CN 
LO 


CO 
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CN 
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CO 




CD 
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LO 
CN 
CN 


CM 
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CO 


CO 
CO 


CO 
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CO 


CO 
CD 


h- 

CO 

LO 


CO 
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Similarity 
(%) 


69.0 


59.3 


49.5 


74.5 


47.0 


78.9 




72.6 






100.0 


79.8 


76.7 


63.4 


66.2 
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24.0 
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CO 
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100.0 


61.1 


51.1 


35.1 


31.8 


33.3 


99.8 


99.6 


Homologous gene 


Bacillus subtilis 168 yciC 


Bacillus subtilis IS58 trxB 


Salmonella typhimurium LT2 
prpD 


Streptomyces hygroscopicus 


Aeropyrum pernix K1 APE0223 


Mycobacterium smegmatis 
ATCC 607 gltA 




Mycobacterium tuberculosis 
H37Rv Rv1129c






Corynebacterium glutamicum 
ATCC 13032 thtR 


Campylobacter jejuni CJ0069 


Mycobacterium leprae 
MLCB4.27c 


Mycobacterium tuberculosis 
H37Rv Rv1565c 


Escherichia coli K12 yceF


Mycobacterium leprae B1308- 
C3-211 


Corynebacterium glutamicum 
AJ11060 dtsR2


Corynebacterium glutamicum 
AJ11060 dtsR1 
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Function 


hypothetical protein 


dTDP-Rha:a-D-GlcNAc-
diphosphoryl polyprenol, a-3-L-
rhamnosyl transferase


mannose-1-phosphate
guanylyltransferase 


regulatory protein 


hypothetical protein 


hypothetical protein 


phosphomannomutase 


hypothetical protein


mannose-6-phosphate isomerase 






pheromone-responsive protein 




S-adenosyl-L-homocysteine 
hydrolase 






thymidylate kinase 


Matched 
length 
(aa) 


h- 

CN 
LD 


05 
CO 
CN 


CO 
LO 
CO 


CO 


O) 
CO 


CD 
CO 


o 

CO 


CN 

CO 


O 
CN 






o 

CO 
r — 




CD 






CD 

o 

CN 


Similarity 
(%) 


71.4 


77.9 


66.9 


81.9 


74.8 


71.3 


66.3 


56.3 


66.2 






57.8 




83.0 






56.0 


Identity 
(%) 


45.5 


56.4 


29.8 


T 
CO 


48.9 


LO 
LO 


38.0 


CN 
CO 


36.9 






35.6 




59.0 






25.8 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3267 


Mycobacterium smegmatis 
mc2155 wbbL 


Saccharomyces cerevisiae 
YDL055C MPG1 


Mycobacterium smegmatis 
whmD 


Mycobacterium tuberculosis 
H37Rv Rv3259


Streptomyces coelicolor A3(2) 
SCE34.11c


Salmonella montevideo M40 
manB 


Mycobacterium tuberculosis 
H37Rv Rv3256c 


Escherichia coli K12 manA






Enterococcus faecalis plasmid 
pCF10 prgC 




Trichomonas vaginalis WAA38 






Archaeoglobus fulgidus VC-16 
AF0061 
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787045 


787983 


787170 


788546 


790093 


788719 


789002 


790704 
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(nt) 


778711 
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780128 


781468 


782617 
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Function 


regulatory protein 


hypothetical protein 


hypothetical protein 


DEAD box ATP-dependent RNA
helicase 


hypothetical protein 


hypothetical protein 


ATP-dependent DNA helicase 


ATP-dependent DNA helicase 




potassium channel 


hypothetical protein 
DNA helicase II




hypothetical protein 


Matched 
length 
(a.a) 


CO 


CO 
CN 


LO 


CO 
LO 


CN 


CO 
CN 


1155 


1126 




CN 
O 
CO 


o o 

CO CO 
CN CO 




o 

CO 
CN 


Similarity 

(%) 
_ 


96.4 


65.1 


62.2 


64.0 
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64.2 
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41.4 




26.2 
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Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3219 whiB1 


Mycobacterium tuberculosis 
H37Rv Rv3217c


Mycobacterium tuberculosis 
H37Rv Rv3212 
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Function 


myo-inositol monophosphatase 

— — 


peptide chain release factor 2 


cell division ATP-binding protein 


hypothetical protein


cell division protein 


small protein B (SSRA-binding 
protein) 


hypothetical protein 








vibriobactin utilization protein 


Fe-regulated protein 


hypothetical membrane protein 


ferric anguibactin-binding protein 
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ferrichrome ABC transporter 
(permease) 


ferrichrome ABC transporter 
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ferrichrome ABC transporter (A I r- 
binding protein) 
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35.6 
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Homologous gene 


Streptomyces flavopersicus 
spcA 


Streptomyces coelicolor A3(2)
prfB 


Mycobacterium tuberculosis 
H37Rv Rv3102c ftsE


Aeropyrum pernix K1 APE2061 


Mycobacterium tuberculosis 
H37Rv Rv3101c ftsX


Escherichia coli K12 smpB 


Escherichia coli K12 yeaO








Vibrio cholerae OGAWA 395 
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845181 
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Function 


hypothetical protein 


hypothetical protein 


kynurenine 

aminotransferase/glutamine 
transaminase K 




DNA repair helicase 


hypothetical protein 

i — - — 


hypothetical protein 




resuscitation-promoting factor 


cold shock protein 


hypothetical protein 


glutamine cyclotransferase 
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_ 
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67.8 
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CO 
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30.7 


36.1 
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39.4 


42.6 


28.3 


CO 






43.6 




27.9 




Homologous gene 


Chlamydia muridarum Nigg 
TC0129 


Chlamydia pneumoniae 


Rattus norvegicus (Rat) 




Saccharomyces cerevisiae 
S288C YIL143C RAD25 


Mycobacterium tuberculosis 
H37Rv Rv0862c


Mycobacterium tuberculosis 
H37Rv Rv0863 




Micrococcus luteus rpf 


Lactococcus lactis cspB 


Mycobacterium leprae 
MLCB57.27C 


Deinococcus radiodurans 
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860473 
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867571 


868630


867803 


869318 
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870721 
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872016 
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860745 
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865066 
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870691 
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873213 
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Function 


hypothetical protein 


phosphoserine transaminase 


acetyl-coenzyme A carboxylase 
carboxy transferase subunit beta 


hypothetical protein 
sodium/proline symporter 




hypothetical protein 


fatty-acid synthase 




homoserine O-acetyltransferase






glutaredoxin 


dihydrofolate reductase 
thymidylate synthase


ammonium transporter 


ATP dependent DNA helicase 


formamidopyrimidine-DNA
glycosidase 
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47.4 
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Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0883c 


Bacillus circulans ATCC 21783 


Escherichia coli K12 accD
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Mycobacterium avium folA 
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Function 


UTP-glucose-1-phosphate
uridylyltransferase


molybdopterin biosynthesis protein 


ribosomal-protein-alanine N- 
acetyltransferase 


hypothetical membrane protein 


cyanate transport protein 




hypothetical membrane protein 


hypothetical membrane protein 


cyclomaltodextrinase 


hypothetical membrane protein 


hypothetical protein 


methionyl-tRNA synthetase 


ATP-dependent DNA helicase


hypothetical protein 
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Function 


transposase 


transposase subunit 




D-lactate dehydrogenase 


site-specific DNA-methyltrai 




transposase 


transposase 


transcriptional regulator 


cadmium resistance protein




hypothetical protein 


hypothetical protein 


dimethyladenosine transfer 


isopentenyl monophosphal 




ABC transporter 


pyridoxine kinase 


hypothetical protein 


hypothetical protein 
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hypothetical protein 
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hVDothetical protein 
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hypothetical membrane protein 


two-component system sensor 
histidine kinase 


two component transcriptional 
regulator (luxR family) 


hypothetical membrane protein 
ABC transporter 


ABC transporter 
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transposase protein fragment 
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Function 


molybdopterin biosynthesis cnx1
protein (molybdenum cofactor
biosynthesis enzyme cnx1)


extracellular serine protease 
precursor




hypothetical membrane protein 


hypothetical membrane protein 


molybdopterin guanine dinucleotide 
synthase 


molybdopterin biosynthesis protein


molybdopterin biosynthesis protein
(molybdenum
cofactor biosynthesis enzyme)


medium-chain fatty acid-CoA ligase


Rho factor 








peptide chain release factor 1 


protoporphyrinogen oxidase 




hypothetical protein 


undecaprenyl-phosphate alpha-N- 
acetylglucosaminyltransferase 
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65.0 


45.9 




62.6 
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73.7 


65.7 
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86.0 


58.4 
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31.6 
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31.1 


Homologous gene 


Arabidopsis thaliana CV cnx1


Serratia marcescens strain IFO- 
3046 prtS 




Mycobacterium tuberculosis 
H37Rv Rv1841c 


Mycobacterium tuberculosis 
H37Rv Rv1842c 


Pseudomonas putida mobA 


Mycobacterium tuberculosis 
H37Rv Rv0438c moeA 


Arabidopsis thaliana cnx2 


Pseudomonas oleovorans 


Micrococcus luteus rho 








Escherichia coli K12 RF-1 


Escherichia coli K12 




Mycobacterium tuberculosis 
H37Rv Rv1301 


Escherichia coli K12 rfe 
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1265611 


1265427 
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1269343 


1268267 
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1271192 


Initial 
(nt) 


1254146 
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1257858 
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Function 




hypothetical protein 


ATP synthase chain a (protein 6) 


H+-transporting ATP synthase lipid-
binding protein, ATP synthase C
chain


H+-transporting ATP synthase chain
b 


H+-transporting ATP synthase delta 
chain 


H+-transporting ATP synthase alpha 
chain 


H+-transporting ATP synthase 
gamma chain 


H+-transporting ATP synthase beta 
chain 


H+-transporting ATP synthase 
epsilon chain 


hypothetical protein 


hypothetical protein 


putative ATP/GTP-binding protein 


hypothetical protein 


hypothetical protein 


thioredoxin 




Matched 
length 
(a.a) 




o 

CO 


LO 
CN 




LO 


r- 

CN 


CO 
LO 


o 

CN 
CO 


CO 
CO 


CN 
CN 


CN 
CO 


o 

CO 
CN 


LO 
o> 


co 


r— 

o 


5 

CO 




Similarity
(%) 




99.0 


56.7 


85.9 


66.9 


67.2 


88.4 


76.6 


100.0 


73.0 


67.4 


85.7 


56.0 
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Homologous gene 




Corynebacterium glutamicum 
atpI


Escherichia coli K12 atpB


Streptomyces lividans atpL 


Streptomyces lividans atpF 


Streptomyces lividans atpD 


Streptomyces lividans atpA 


Streptomyces lividans atpG 


Corynebacterium glutamicum 
AS019 atpB


Streptomyces lividans atpE 


Mycobacterium tuberculosis 
H37Rv Rv1312


Mycobacterium tuberculosis 
H37Rv Rv1321


Streptomyces coelicolor A3(2) 


Bacillus subtilis yqjC 


Mycobacterium tuberculosis 
H37Rv Rv1898


Mycobacterium tuberculosis 
H37Rv Rv1324
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Function 


FMNH2-dependent aliphatic 
sulfonate monooxygenase 


aliphatic sulfonates transport
permease protein 


aliphatic sulfonates transport
permease protein 


sulfonate binding protein precursor 


1,4-alpha-glucan branching enzyme 
(glycogen branching enzyme) 


alpha-amylase 


t 

I 


ferric enterobactin transport ATP-
binding protein or ABC transport 
ATP-binding protein 


hypothetical protein 


hypothetical protein 




electron transfer flavoprotein beta-
subunit 


electron transfer flavoprotein alpha
subunit for various dehydrogenases 




nitrogenase cofactor synthesis protein




hypothetical protein 
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Homologous gene 


Escherichia coli K12 ssuD


Escherichia coli K12 ssuC


Escherichia coli K12 ssuB


Escherichia coli K12 ssuA 


Mycobacterium tuberculosis 
H37Rv Rv1326c glgB


Dictyoglomus thermophilum 
amyC 




Escherichia coli K12 fepC


Mycobacterium tuberculosis 
H37Rv Rv3040c 


Mycobacterium tuberculosis 
H37Rv Rv3037c 




Rhizobium meliloti fixA 


Rhizobium meliloti fixB 




Azotobacter vinelandii nifS 




Rhizobium sp. NGR234 plasmid
pNGR234a y4mE 
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Function 




glucose-resistance amylase 
regulator (catabolite control protein) 


ribose transport ATP-binding protein


high affinity ribose transport protein 


periplasmic ribose-binding protein 


high affinity ribose transport protein 

_ 


hypothetical protein 


iron-siderophore binding lipoprotein 


Na-dependent bile acid transporter 


RNA-dependent amidotransferase B 


putative F420-dependent NADH 
reductase 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 




dihydroxy-acid dehydratase 


hypothetical protein 
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Homologous gene 




Bacillus megaterium ccpA 


Escherichia coli K12 rbsA 


Escherichia coli K12 MG1655 
rbsC 


Escherichia coli K12 MG1655 
rbsB 


Escherichia coli K12 MG1655 
rbsD 


Saccharomyces cerevisiae 
YIR042C 


Streptomyces coelicolor
SCF34.13C 


Rattus norvegicus (Rat) NTCI 


Staphylococcus aureus WHU 
ratB 


Methanococcus jannaschii 
MJ1501 f4re 


Escherichia coli K12yoJG 


Mycobacterium tuberculosis 
H37Rv Rv2972c 


Mycobacterium tuberculosis 
H37Rv Rv3005c 




Corynebacterium glutamicum 
ATCC 13032 ilvD 


Mycobacterium tuberculosis 
H37Rv Rv3004 
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Function 


hypothetical protein 


30S ribosomal protein S1 




hypothetical protein 










inosine-uridine preferring nucleoside 
hydrolase (purine nucleosidase)


antiseptic resistance protein


ribose kinase 


cryptic asc operon repressor,
transcription regulator




excinuclease ABC subunit B 


hypothetical protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical protein 


hydrolase
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Function 


excinuclease ABC subunit A 


hypothetical protein 1246 (uvrA 
region) 


hypothetical protein 1246 (uvrA 
region) 






translation initiation factor IF-3 


50S ribosomal protein L35


50S ribosomal protein L20 






sn-glycerol-3-phosphate transport 
system permease protein 


sn-glycerol-3-phosphate transport 
system protein 


sn-glycerol-3-phosphate transport 
system permease protein


sn-glycerol-3-phosphate transport 
ATP-binding protein 


hypothetical protein 


glycerophosphoryl diester 
phosphodiesterase 


tRNA(guanosine-2'-O-)-
methyltransferase


phenylalanyl-tRNA synthetase alpha 
chain 
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Homologous gene 


Escherichia coli K12 uvrA 
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Function 


cytidylate kinase 


GTP binding protein 






methyltransferase 


ABC transporter 


ABC transporter 




hypothetical membrane protein 




Na+/H+ antiporter 






hypothetical protein 


2-hydroxy-6-oxohepta-2,4-dienoate 
hydrolase 


preprotein translocase SecA subunit 


signal transduction protein 


hypothetical protein 
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Function 


hypothetical protein 










hemolysin 


hemolysin 




DEAD box RNA helicase 


ABC transporter ATP-binding protein 


6-phosphogluconate dehydrogenase 


thioesterase 




nodulation ATP-binding protein I 


hypothetical membrane protein 
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phosphonates transport system 
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phosphonates transport ATP-binding 
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precorrin 2 methyltransferase 


precorrin-6Y C5,15-
methyltransferase






oxidoreductase 


dipeptidase or X-Pro dipeptidase




ATP-dependent RNA helicase 
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hypothetical protein 


hypothetical protein 
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Mycobacterium tuberculosis 
H37Rv Rv2111c 


Mycobacterium tuberculosis 
H37Rv Rv2112c


Aeropyrum pernix K1 APE2014 


db Match 


pir:C70764


sp:COBL_PSEDE






ZD 

H 
O 
> 

c, ! 

> 
> 

CL 

to 


i 

o 

CD 

o 
LL 

< 
CL 

cn 




t— 
< 

LU 

I 

xT 
H 
CL 

to 


_j 
o 
o 

LU 

1 

o 

CL 

to 


UJ 
_! 
O 
> 

1 

CO 

>- 
> 

CL 

to 


ZD 
h- 
O 
> 

CO 

> 
> 
cL 
to 


UJ 

1 

O 
> 

CD 1 
CO 

> 

>- 

CL 

to 


sp:YY37_MYCTU 




pir:B70512 


pir:C70512 


pir:H72504


ORF 




1278 


CO 

CD 
CO 


CO 
CN 


CO 
CO 


1137 


CD 

CO 

CD 


2787 


1002


LO 
CO 


T— 

CO 
CD 


CN 
CD 


1425 


CD 
CN 


CN 
CD 


1542 


0 

CO 


Terminal 
(nt) 


1562553 


1562525 


1564237 


1564482 


1564565 


1565302 


1567106 


1567117 


1569932 


1571068 


1571506 


1572492 


1573491


1575205 


1574945 


1575406 


1577806 


Initial 
(nt) 


1561780 


1563802 


1563872 


1564237 


1565302 


1566438 


1566468 


1569903 


1570933 


1571382 


1572486 


1573463 


1574915


1574957 


1575136 


1576947 


1577327 


SEQ 
NO. 
(a.a.) 


5137 


5138 


5139 


5140 


5141 


5142 


5143


5144 


5145 


5146 


5147 


5148 


5149 


5150 


5151 


5152 


5153 


° rS < 
ujOz 


r^- 

CO 
CD 


1638 


CD 
CO 
CD 


o 

CD 


CD 


CN 
CD 


CO 
CD 


co 


in 

CD 


CD 
CD 


CO 


CO 
CD 


CD 
XT 
CD 


0 

LO 
CD 


x— 

10 

CO 


CN 
LO 
CD 


CO 

in 

CD 






-O 



Function 


AAA family ATPase (chaperone-like 
function) 


protein-beta-aspartate 
methyltransferase 


aspartyl aminopeptidase 


hypothetical protein 


virulence-associated protein 


quinolone resistance protein


aspartate ammonia-lyase 


ATP phosphoribosyltransferase 


beta-phosphoglucomutase 


5-methyltetrahydrofolate- 
homocysteine methyltransferase 




alkyl hydroperoxide reductase 
subunit F 


arsenical-resistance protein 


arsenate reductase 


arsenate reductase 




cysteinyl-tRNA synthetase 


Matched 
length 


m 

*t 

LO 


CO 
CN 


CO 
CO 


CD 
CD 
CN 


CD 
CD 


LO 
00 
CO 


CO 
CN 
LO 


CO 
CN 


LO 


1254 




CD 
CD 
CO 


CO 

CO 
CO 


CO 

CN 


CO 

CN 




00 
CO 


Similarity 
(%) 


78.5 


79.0 


67.2


71.4 


72.5 


o 

CO 


99.8 


97.5 


63.1 


62.4 




49.5 


63.9 


64.3 


75.6 




64.3 


Identity 
(%) 


51.6 


57.3 


38.1 


45.4 


40.6 


CO 
CN 


99.8 


96.8 


30.8


CD 
CO 




22.4 


33.0 


32.6 


47.2 




35.9 


Homologous gene 


Rhodococcus erythropolis arc 


Mycobacterium leprae pimT 


Homo sapiens 


Mycobacterium tuberculosis 
H37Rv Rv2119 


Dichelobacter nodosus A198 
vapI


Staphylococcus aureus norA23 


Corynebacterium glutamicum 
(Brevibacterium flavum) MJ233 
aspA 


Corynebacterium glutamicum 
AS019 hisG


Thermotoga maritima MSB8 
TM1254 


Escherichia coli K12 metH 




Xanthomonas campestris ahpF 


Saccharomyces cerevisiae 
S288C YPR201W acr3 


Staphylococcus aureus plasmid 
pI258 arsC


Mycobacterium tuberculosis 
H37Rv arsC 




Escherichia coli K12 cysS 


db Match 


prf:2422382Q 


co 
CN 

00 
L.' 
*CL 


gp:AF005050_1


pir:B70513 


O 

o 
< 

CD 

I 

Q. 
< 
> 
bl 

CO 


prf:2513299A 


sp:ASPA_CORGL 


gp:AF050166_1


pir:H72277 


sp:METH_ECOLI 




X 

o 

CL 
X 
< 
bl 

ifi 


H 

CO 

< 

LU 

CO 

cn 
o 
< 

bl 

CO 


sp:ARSC_STAAU 


pir:G70964 




— i 
O 
o 
in 

o' 

> 

CO 

bl 
tn 


ORF 

t D P) 


1581 


CO 
CO 


1323 


co 

CO 


CD 
CN 


1209 


1578 


CO 

co 


CO 
CD 
CO 


3663 


o 
to 


1026 


1176 


o 
CN 


CD 
CO 
CO 


CO 
CO 


1212 


Terminal 
(nt) 


1576951 


1578567 


1579449 


1581640 


1582114 


1582273 


1583913 


1585603 


1586812 


1587573 


1591912 


1591941 


1594512 


1594951 


1595668 


1595844 


1596249 


Initial 
(nt) 


1578531 


1579400 


1580771 


1580807 


1581851 


1583481 


1585490 


1586445 


1587504 


1591235 


1591343 


1592966 


1593337 


1594532 


1595030 


1596221 


1597460 


SEQ
NO.
(a.a.)


5154 


5155 


5156 


5157 


5158 


5159 


5160 


5161 


5162 


5163 


5164 


5165 


5166 


5167 


5168 


5169 


5170 


a r< < 


lo 
CD 


LO 

m 
CD 


CD 

m 
CD 


h- 

LO 

CD 


CO 

m 
CD 


CD 

in 
CD 


o 

CO 
CD 


CD 
CD 


CN 
CD 
CO 


CO 
CD 
CO 


CD 
CD 


LO 
CO 
CD 


CD 
CD 
CD 


1667 


CO 
CO 
CD 


CO 
CD 
CD 


1670 



217 - 









Function 


methylmalonyl-CoA mutase beta 
subunit 


hypothetical membrane protein 




hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 




ferrochelatase 


invasin 




aconitate hydratase 


transcriptional regulator 


GMP synthetase 


hypothetical protein 


hypothetical protein 




hypothetical protein 


Matched 
length 
(aa) 


o 

r— 

CO 


CXI 

CN 




o 

CO 


-a- 


T — 

CD 

CN 




CD 
CO 


CO 




CO 

LO 

CO 




LO 
CO 

CN 


CN 

CN 


CD 
CO 




CD 
XT 


Similarity 
(%) 


68.2 


70.1 




87.0 


78.7 


CO 
CN 




65.7 


56.5 




85.9 


81.6 


CO 
LO 


62.0 


80.2 




86.1 


Identity 

(%) 


41.6 


39.7 




64.1 


44.7 


51.0 




36.8 


25.5 




CO 

CO 
CD 


54.6 


CO 
CM 


32.6 


37.2 




CM 
CO 


Homologous gene 


Streptomyces cinnamonensis 
A3823.5 mutA 


Mycobacterium tuberculosis 
H37Rv Rv1491c 




Mycobacterium tuberculosis 
H37Rv Rv1488 


Mycobacterium tuberculosis 
H37Rv Rv1487 


Streptomyces coelicolor A3(2) 
SCC77.24 




Propionibacterium freudenreichii 
subsp. shermanii hemH 


Streptococcus faecium 




Mycobacterium tuberculosis 
H37Rv acn 


Mycobacterium tuberculosis 
H37Rv Rv1474c 


Methanococcus jannaschii 
MJ1575 guaA 


Streptomyces coelicolor A3(2) 
SCD82.04c 


Methanococcus jannaschii 
MJ1558 




Neisseria meningitidis MC58 
NMB1652 


db Match 


o 

j_ 
03 

s' 

ID 

bl 
co 


sp:YS13_MYCTU 




ZD 
\~ 
O 
> 

I 

CO 

o 

> 
bl 

CO 


pir:B70711 


CN 

rJ 

o 
o 

CO 

bl 

O) 




sp:HEMZ_PROFR 


sp:P54_ENTFC 




pir:F70873 


pir:E70873 


pir:F64496 


CO 

Q 
O 
co 
bl 
cn 


pir:E64494 




gp:AE002515_9 


ORF 
(bp) 


1848 


723 


597 


1296 


435 


843 


783 


1110 


1800 


498 


2829 


564 


756 


663 


267 


393 


1392 


Terminal 
(nt) 


1614451 


1617300 


1617994 


1618321 


1619672 


1620167 


1621838 


1621841 


1623027 


1625428 


1629107 


1629861 


1630668 


1630667 


1631926 


1631353 


1633324 


Initial 
(nt) 


1616298 


1616578 


1617398 


1619616 


1620106 


1621009 


1621056 


1622950 


1624826 


1625925 


1626279 


1629298 


1629913 


1631329 


1631660 


1631745 


1631933 


SEQ 
NO. 
(a.a.) 


5189 


5190 


5191 


5192 


5193 


5194 


5195 


5196 


5197 


5198 


5199 


5200 


5201 


5202 


5203 


5204 


5205 


SEQ 
NO. 
(DNA) 


1689 


1690 


1691 


1692 


1693 


1694 


1695 


1696 


1697 


1698 


1699 


1700 


1701 


1702 


1703 


1704 


1705 






Function 


antigenic protein 


antigenic protein 


cation-transporting ATPase P 




hypothetical protein 










host cell surface-exposed lipoprotein 


integrase 


ABC transporter ATP-binding protein 




sialidase 


transposase (IS1628) 


transposase protein fragment 


hypothetical protein 




dTDP-4-keto-L-rhamnose reductase 


nitrogen fixation protein 


Matched 
length 

(a.a) 


CO 


CN 

to 


CO 
CO 
CO 




o 
CN 










o 

T — 


LO 

-c — 


CD 




co 

CO 


CD 
CO 
CN 


CO 


CO 
CO 




o 


CD 


>* 










































Similarity 
(%) 


60.0 


69.0 


73.2 




58.3 










73.8 


60.4 


64.4 




72.4 


100.0 


72.0 


43.0 




70.1 


85.2 


CO 










































Identity 
(%) 


54.0 


59.0 


42.6 




35.8 










43.0 


34.4 


32.8 




CO 

T 

LO 


99.6 


64.0 


32.0 




32.7 


63.8 


Homologous gene 


Neisseria gonorrhoeae ORF24 


Neisseria gonorrhoeae 


Synechocystis sp. PCC6803 
sll1614 pma1 




Streptomyces coelicolor A3(2) 
SC3D11.02c 










Streptococcus thermophilus 
phage TP-J34 


Corynephage 304L int 


Escherichia coli K12 yjjK 




Micromonospora viridifaciens 
ATCC 31146 nedA 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 


Corynebacterium glutamicum 
TnpNC 


Plasmid NTP16 




Pyrococcus abyssi Orsay 
PAB1087 


Mycobacterium leprae 
MLCL.536.24c nifU7 


db Match 


GSP:Y38838 


GSP:Y38838 


CO 

> 

> 
CO 

I 

< 

t/3 




5 

CO 

o 

CO 
cL 

CD 










prf:2408488H 


prf:2510491A 


_j 
o 
o 

LU 

— > 
—> 

> 

CO 




> 
o 

i 

IE 

< 

C/3 


CO 

I 

CD 

o 
o 

CN 

LL. 
< 


GPU:AF164956_23 


GP:NT1TNIS_5 




pir:B75015 


pir:S72754 


ORF 
(bp) 


480 


456 


2676 


783 


489 


1362 


357 


156 


162 


375 


456 


1629 


1476 


1182 


708 


243 


261 


585 


423 


447 




Terminal 
(nt) 


1632109 


1632682 


1636241 


1633781 


1636244 


1638442 


1638776 


1639520 


1639817 


1640155 


1641001 


1641046 


1642743 


1644318 


1646368 


1646063 


1645601 


1647133 


1647212 


1647651 


Initial 
(nt) 


1632588 


1633137 


1633566 


1634563 


1636732 


1637081 


1639132 


1639365 


1639656 


1639781 


1640546 


1642674 


1644218 


1645499 


1645661 


1645821 


1645861 


1646549 


1647634 


1648097 


SEQ 
NO. 
(a.a.) 


5206 


5207 


5208 


5209 


5210 


5211 


5212 


5213 


5214 


5215 


5216 


5217 


5218 


5219 


5220 


5221 


5222 


5223 


5224 


5225 


SEQ 
NO. 
(DNA) 


1706 


1707 


1708 


1709 


1710 


1711 


1712 


1713 


1714 


1715 


1716 


1717 


1718 


1719 


1720 


1721 


1722 


1723 


1724 


1725 






Function 


hypothetical protein 


nitrogen fixation protein 


ABC transporter ATP-binding protein 


hypothetical protein 


ABC transporter 


DNA-binding protein 


hypothetical membrane protein 


ABC transporter 


hypothetical protein 


hypothetical protein 




helicase 


quinone oxidoreductase 


cytochrome o ubiquinol oxidase 
assembly factor / heme O 
synthase 


transketolase 


transaldolase 




Matched 
length 
(aa) 


CN 
in 


■si- 


CN 

LO 
CN 


r-- 
co 


CO 
CO 


T — 

CN 


CO 

to 


CO 


CD 
CD 
CN 


CD 
CN 




CO 
N" 


CO 
CN 
CO 


LO 
CD 
CN 


LO 
CD 


CO 
LO 

CO 




CO 


o 




CO 


o 


o 




CO 


CO 


CO 


CD 




O 


CO 


CO 


o 


CN 






to 


CO 


CD 
CO 


CO 
CO 


CO 


r— 


CD 










LO 


o* 
r-- 


CO 
CO 


d 
o 


to 

CO 




CO 




































Identity 
(%) 


48.0 


64.7 


70.2 


55.2 


41.0 


46.1 


36.3 


50.2 


o 


43.0 




23.4 


37.5 


37.6 


100.0 


62.0 




Homologous gene 


Aeropyrum pernix K1 APE2025 


Mycobacterium leprae nifS 


Streptomyces coelicolor A3(2) 
SCC22.04c 


Mycobacterium tuberculosis 
H37Rv Rv1462 


Synechocystis sp. PCC6803 
slr0074 


Streptomyces coelicolor A3(2) 
SCC22.08c 


Mycobacterium tuberculosis 
H37Rv Rv1459c 


Mycobacterium leprae 
MLCL536.31 abc2 


Mycobacterium leprae 
MLCL536.32 


Mycobacterium tuberculosis 
H37Rv Rv1456c 




Pyrococcus horikoshii PH0450 


Escherichia coli K12 qor 


Nitrobacter winogradskyi coxC 


Corynebacterium glutamicum 
ATCC 31833 tkt 


Mycobacterium leprae 
MLCL536.39 tal 




db Match 


PIR:C72506 


CD 
h~ 
CN 

r— 
CO 

"a. 


gp:SCC22_4 


pir:A70872 


sp:Y074_SYNY3 


CO 

CN 
O 
O 
CO 
H. 
CO 


pir:F70871 


pir:S72783 


pir:S72778 


pir:C70871 




pir:C71156 


sp:QOR_ECOLI 


gp:NWCOXABC_3 


gp:AB023377_1 


LJJ 

_j 
o 
>- 

J 

|5 

bl 






ORF 
(bp) 


162 


1263 


756 


1176 


1443 


693 


1629 


1020 


804 


999 


357 


1629 


975 


969 


2100 


1080 


1164 


Terminal 
(nt) 


1648709 


1648100 


1649367 


1650249 


1651433 


1652894 


1655671 


1656700 


1657515 


1658675 


1659140 


1661136 


1662552 


1662630 


1666502 


1667752 


1666601 


Initial 
(nt) 


1648548 


1649362 


1650122 


1651424 


1652875 


1653586 


1654043 


1655681 


1656712 


1657677 


1659496 


1659508 


1661578 


1663598 


1664403 


1666673 


1667764 


SEQ 

NO 

(a.a.) 


5226 


5227 


5228 


5229 


5230 


5231 


5232 


5233 


5234 


5235 


5236 


5237 


5238 


5239 


5240 


5241 


5242 


SEQ 
NO. 
(DNA) 


1726 


1727 


1728 


1729 


1730 


1731 


1732 


1733 


1734 


1735 


1736 


1737 


1738 


1739 


1740 


1741 


1742 
Function 


orotidine-5-phosphate 
decarboxylase 


carbamoyl-phosphate synthase 
large chain 


carbamoyl-phosphate synthase 
small chain 


dihydroorotase 


aspartate carbamoyltransferase 


phosphoribosyl transferase or 
pyrimidine operon regulatory protein 


cell division inhibitor 








N utilization substance protein B 
(regulation of rRNA biosynthesis by 
transcriptional antitermination) 


elongation factor P 


cytoplasmic peptidase 


3-dehydroquinate synthase 


shikimate kinase 


type IV prepilin-like protein specific 
leader peptidase 
Matched 
length 
(aa) 
Similarity 
(%) 


73.6 


77.5 


70.1 


67.7 


79.7 


80.1 


73.4 








69.3 


98.4 


100.0 


99.7 


100.0 


54.9 


Identity 
(%) 


51.8 


53.1 


45.4 


42.8 


48.6 


54.0 


39.7 








33.6 


97.9 


99.5 


98.6 


100.0 


35.2 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv uraA 


Escherichia coli carB 


Pseudomonas aeruginosa 
ATCC 15692 carA 


Bacillus caldolyticus DSM 405 
pyrC 


Pseudomonas aeruginosa 
ATCC 15692 


Bacillus caldolyticus DSM 405 
pyrR 


Mycobacterium tuberculosis 
H37Rv Rv2216 








Bacillus subtilis nusB 


Brevibacterium lactofermentum 
ATCC 13869 efp 


Corynebacterium glutamicum 
AS019 pepQ 


Corynebacterium glutamicum 
AS019 aroB 


Corynebacterium glutamicum 
AS019 aroK 


Aeromonas hydrophila tapD 


db Match 


sp:DCOP_MYCTU 


pir:SYECCP 


LU 
< 
LU 
CO 

0_ 

<' 

cc 
< 
0 

bl 
tn 


1 

O 
0 
< 

CO 

1 

O 
ct 
> 

0_ 

bl 
tn 


LU 
< 
LU 
CO 
CL 

1 

CD 
CsL 
> 
CL 

b_ 
Ifi 


sp:PYRR_BACCL 


ZD 

O 
> 

O 
O 
>- 
CL 
CO 








ZD 
CO 

0 
< 

CD 

CD ! 
CO 
3 
ZZ 
cL 
to 


S 

LU 
LT 
CQ 

I 

o_ 

LL 
UJ 
bl 

CO 


gp:AF124600_4 


CO 

I 

0 

0 

CO 

xr 
CN 

LL 
< 
ol 
O) 


gp:AF124600_2 


>- 

X 
OI 
LU 

«. 

CO 
Q_ 
LU 
— I 
b. 

CO 


ORF 
(bp) 


xr 

CO 
CO 


3339 


1179 


1341 


CO 

CO 
CD 


CD 

LO 


1164 


xr 


CN 
CO 

xr 


0 

CN 


v~ 
CO 
CO 


CD 
LO 


1089 


1095 


CN 
CD 

xr 
1704359 


1707706 


1709017 


1710413 


1711352 


1713759 


1714306 


1714760 


1714950 


1715382 


1716132 


1716780 


1717938 


1719107 


1720971 


Initial 
(nt) 


1704350 


1707697 


1708884 


1710357 


1711348 


1711927 


1712596 


1713830 


1714299 


1714741 


1716062 


1716692 


1717868 


1719032 


1719598 


1721381 
Function 


bacterial regulatory protein, arsR 
family 


ABC transporter 




iron(III) ABC transporter, 
periplasmic-binding protein 


ferrichrome transport ATP-binding 
protein 


shikimate 5-dehydrogenase 


hypothetical protein 


hypothetical protein 


alanyl-tRNA synthetase 


hypothetical protein 




aspartyl-tRNA synthetase 


hypothetical protein 


glucan 1,4-alpha-glucosidase 


phage infection protein 




transcriptional regulator 
Matched 
length 
(aa) 
Identity 


35.9 


38.3 


50.0 


41.8 


52.8 


43.3 


65.4 




71.1 


46.1 


26.1 


23.1 




29.2 


Homologous gene 


Streptomyces coelicolor A3(2) 
SC1A2.22 


Corynebacterium diphtheriae 
hmuU 




Pyrococcus abyssi Orsay 
PAB0349 


Bacillus subtilis 168 fhuC 


Mycobacterium tuberculosis 
H37Rv aroE 


Mycobacterium tuberculosis 
H37Rv Rv2553c 


Mycobacterium tuberculosis 
H37Rv Rv2554c 


Thiobacillus ferrooxidans ATCC 
33020 alaS 


Mycobacterium tuberculosis 
H37Rv Rv2559c 




Mycobacterium leprae aspS 


Mycobacterium tuberculosis 
H37Rv Rv2575 


Saccharomyces cerevisiae 
S288C YIR019C sta1 


Bacillus subtilis yhgE 




Streptomyces coelicolor A3(2) 
SCE68.13 
1725439 


1727170 


1730048 


1731542 
SEQ 
NO. 
(a.a.) 


5294 


5295 


5296 


5297 


5298 


5299 


5300 


5301 


5302 


5303 


5304 


5305 


5306 


5307 


5308 


5309 


5310 


SEQ 
NO. 
(DNA) 


1794 


1795 


1796 


1797 


1798 


1799 


1800 


1801 


1802 


1803 


1804 


1805 


1806 


1807 


1808 


1809 


1810 






Function 




oxidoreductase 




NADH-dependent FMN reductase 


L-serine dehydratase 




alpha-glycerolphosphate oxidase 


histidyl-tRNA synthetase 


hydrolase 


cyclophilin 




hypothetical protein 




GTP pyrophosphokinase 


adenine phosphoribosyltransferase 


dipeptide transport system 


hypothetical protein 


protein-export membrane protein 
98.8 


60.9 
Identity 
(%) 




72.8 




37.1 


46.8 




28.4 


43.2 


40.3 


35.4 




98.4 




99.9 


99.5 


98.0 


30.7 


25.9 




Homologous gene 




Streptomyces coelicolor A3(2) 
SCE15.13c 




Pseudomonas aeruginosa PAO1 
slfA 


Escherichia coli K12 sdaA 




Enterococcus casseliflavus glpO 


Staphylococcus aureus 
SR17238 hisS 


Campylobacter jejuni 
NCTC11168 Cj0809c 


Streptomyces chrysomallus 
sccypB 




Corynebacterium glutamicum 
ATCC 13032 orf4 




Corynebacterium glutamicum 
ATCC 13032 rel 


Corynebacterium glutamicum 
ATCC 13032 apt 


Corynebacterium glutamicum 
ATCC 13032 dciAE 


Mycobacterium tuberculosis 
H37Rv Rv2585c 


Escherichia coli K12 secF 



Function 












puromycin N-acetyltransferase 






















ferric transport ATP-binding protein 










pantothenate metabolism 
flavoprotein 






Matched 
length 
(aa) 












o 

CO 






















CM 

o 

CM 










CO 
CM 






Similarity 
(%) 












64.2 






















28.7 










cd 

CD 






Identity 
(%) 












36.3 






















28.7 










27.1 







Homologous gene 












Streptomyces anulatus pac 






















Actinobacillus 
pleuropneumoniae afuC 










Zymomonas mobilis dfp 






db Match 












sp:PUAC_STRLP 






















CL 

o 

5 

LL 
< 

to 










o 

"l 

CD 
CO 
CO 

CO 
CO 
O 
LL 
< 

cL 

CO 






ORF 
(bp) 


378 


594 


1407 


615 


399 


567 


1086 


1101 


699 


2580 


1113 


1923 


483 


189 


312 


429 


597 


999 


159 


1107 


420 


591 


864 


420 


Terminal 
(nt) 


1777646 


1778037 


1778102 


1779554 


1780507 


1781019 


1782790 


1784381 


1783382 


1782894 


1785732 


1786907 


1789562 


1789768 


1790057 


1790461 


1792438 


1793426 


1793496 


1794820 


1795621 


1796181 


1797049 


1797769 


Initial 
(nt) 


1777269 


1777444 


1779508 


1780168 


1780905 


1781585 


1781705 


1783281 


1784080 


1785473 


1786844 


1788829 


1789080 


1789580 


1789746 


1790889 


1791842 


1792428 


1793654 


1793714 


1795202 


1795591 


1796186 


1797350 


SEQ 
NO. 
(a.a.) 


5348 


5349 


5350 


5351 


5352 


5353 


5354 


5355 


5356 


5357 


5358 


5359 


5360 


5361 


5362 


5363 


5364 


5365 


5366 


5367 


5368 


5369 


5370 


5371 
Function 






































transposon TN21 resolvase 






protein-tyrosine phosphatase 






Matched 
length 
(aa) 






































CO 

CO 
T— 






CD 

T 
























































Similarity 
(%) 






































78.0 






51.8 






Identity 
(%) 










































































































































LD 






29.3 






Homologous gene 






































Escherichia coli tnpR 






Saccharomyces cerevisiae 
S288C YIR026C yvh1 






db Match 

Terminal 
(nt) 


1797850 


1798023 


1799406 


1800366 


1800449 


1801307 


1802096 


1802155 


1803419 


1803893 


1804598 


1804865 


1805599 


1806686 


1807396 


1808113 


1808421 


1808832 


1810372 


1811545 


1811938 


1812691 


1813606 


1812460 


Initial 
(nt) 


1797969 


1798757 


1799182 


1799473 


1800604 


1800834 


1801344 


1802577 


1802733 


1803465 


1804134 


1804629 


1804919 


1805727 


1806917 


1807433 


1808137 


1808458 


1809761 


1810541 


1811564 


1812215 


1812881 


1812882 
5375 


5381 


5389 


5394 



Function 


sporulation transcription factor 


















hypothetical protein 










hypothetical protein 


insertion element (IS3 related) 


insertion element (IS3 related) 






single-stranded-DNA-specific 
exonuclease 




primase 


Matched 
length 
(aa) 


CD 
CM 


















to 
to 










CO 
CD 


CO 
CD 
CM 


o 






CM 
CM 
CD 




CO 


Similarity 
(%) 


65.7 


















55.2 










75.0 


95.6 


84.2 






50.6 




64.3 
















































Identity 
(%) 


34.3 


















22.6 










63.0 


87.9 


72.3 






24.0 




31.8 


Homologous gene 


Streptomyces coelicolor A3(2) 
whiH 


















Thermotoga maritima MSB8 
TM1189 










Corynebacterium glutamicum 


Corynebacterium glutamicum 


Corynebacterium glutamicum 
orfl 






Erwinia chrysanthemi recJ 




Streptococcus phage phi-O1205 
ORF13 


db Match 


gp:SCA32WHIH_6 


















pir:C72285 










pir:S60889 
Terminal 
(nt) 


1814517 


1815651 


1816128 


1816636 


1817803 


1818219 


1818774 


1819166 


1819748 


1820181 


1824322 


1824589 


1824927 


1825178 


1826557 


1825751 


1826644 


1829688 


1832063 


1834044 


1834149 


1838324 


Initial 
(nt) 


1813780 


1814863 


1815673 


1816451 


1817132 


1817803 


1818460 


1818798 


1819954 


1822382 


1822577 


1824371 


1824784 


1825606 


1826024 


1826644 


1826937 


1829900 


1830765 


1832167 


1834928 


1836675 


SEQ 

NO 

(a.a.) 


5396 


5397 


5398 


5399 


5400 


5401 


5402 


5403 


5404 


5405 


5406 


5407 


5408 


5409 


5410 


5411 


5412 


5413 


5414 


5415 


5416 


5417 


SEQ 
NO. 
(DNA) 


1896 


1897 


1898 


1899 


1900 


1901 


1902 


1903 


1904 


1905 


1906 


1907 


1908 


1909 


1910 


1911 


1912 


1913 


1914 


1915 


1916 


1917 



Function 








helicase 




phage N15 protein gp57 




















actin binding protein with SH3 
domains 










ATP/GTP binding protein 




ATP-dependent Clp proteinase ATP- 
binding subunit 


Matched 
length 
(aa) 








o 

CM 
CD 




CO 

o 




















CM 
CN 










co 




o 

CO 
CD 


Similarity 
(%) 








44.7 




64.2 




















49.8 










52.5 




61.0 


Identity 
(%) 








22.1 




36.7 




















28.7 










23.6 




30.2 


Homologous gene 








Mycoplasma pneumoniae ATCC 
29342 yb95 




Bacteriophage N15 gene57 




















Schizosaccharomyces pombe 
SPAPJ760.02C 










Streptomyces coelicolor 
SC5C7.14 




Escherichia coli K12 clpA 


db Match 
1842137 


1842681 


1843337 


1845356 


1845857 


1846207 


1846333 


1847932 


1848474 


1849036 


1849785 


1849966 


1850406 


1849978 


1850474 


1852440 


1852324 


1853873 


1854854 


1855237 


1856788 


1858738 


1860727 


Initial 
(nt) 


1838349 


1842235 


1842804 


1843518 


1845483 


1845872 


1846698 


1847315 


1847938 


1848509 


1848988 


1849781 


1850035 


1850415 


1851049 


1851220 


1851473 


1852479 


1854261 


1855058 


1855532 


1856885 


1858763 
SEQ 
NO. 
(a.a.) 


5423 



Function 










ATP-dependent helicase 










hypothetical protein 


deoxynucleotide monophosphate 
kinase 










type II 5-cytosine 
methyltransferase 


type II restriction endonuclease 






hypothetical protein 




Matched 
length 










CO 
CD 
CD 










CN 
CN 


CO 

0 

CN 










CO 
CO 
CO 


CO 
LO 
CO 






0 

LO 




Similarity 
(%) 










45.9 










47.8 


LO 
^— 

CO 










99.7 


r-~ 
O) 

CD 






45.8 




Identity 
(%) 










CN 










25.9 


31.7 










99.2 


99.7 






24.6 




Homologous gene 










Staphylococcus aureus SA20 
pcrA 










Streptomyces coelicolor A3(2) 
SCH17.07c 


Bacteriophage phi-C31 gp52 










Corynebacterium glutamicum 
ATCC 13032 cglIM 


Corynebacterium glutamicum 
ATCC 13032 cglIR 






Streptomyces coelicolor A3(2) 
SC1A2.16C 




db Match 
gp:SC1A2_16 


1521 


Terminal 
(nt) 


1861225 


1861475 


1861519 


1862399 


1865299 


1865822 


1866219 


1866792 


1867095 


1867874 


1868587 


1868671 


1868927 


1871101 


1871380 


1879400 


1880485 


1882470 


1884220 


1887047 


1887590 


Initial 
(nt) 


1860752 


1861320 


1861842 


1862088 


1862945 


1865265 


1865842 


1866328 


1866832 


1867098 


1867886 


1868895 


1871092 


1871373 


1877886 


1878312 


1879412 


1883990 
1885230 


1887405 
5455 


5456 


5461 
Function 




















submaxillary apomucin 






modification methylase 










hypothetical protein 






hypothetical protein 








Matched 
length 
(aa) 




















1408 






S 










t — 

T 






CO 
CN 
CO 








Similarity 
(%) 




















49.2 






65.6 










58.8 






54.6 








Identity 
(%) 




















23.2 






42.6 










38.6 






27.1 








Homologous gene 




















Sus scrofa domestica 






Escherichia coli ecoR1 










Mycobacterium tuberculosis 
H37Rv Rv1956 






Methanococcus jannaschii 
MJ0137 








db Match 




















pir:T03099 






sp:MTE1_ECOLI 










pir:H70638 


Terminal 
(nt) 


1916733 


1917165 


1917329 


1917564 


1918703 


1919646 


1920347 


1925695 


1926038 


1921547 


1926259 


1927245 


1928381 


1928908 


1929059 


1930990 


1931421 


1931935 


1932373 


1933522 


1934971 


1936849 


1937411 


1937486 


Initial 
(nt) 


1916374 


1916944 


1917640 


1918208 


1919461 


1920194 


1921276 


1925390 


1925682 


1926010 


1926837 


1928189 


1928211 


1928534 


1930879 


1931190 


1931888 


1932315 


1932879 


1934358 


1935912 


1936226 


1937202 


1938019 


SEQ 
NO. 
(a.a.) 


5487 


5488 


5489 
5492 


5493 


5494 


5495 


5496 


5497 


5498 


5499 


5500 


5501 
5503 
Function 


bifunctional protein (riboflavin kinase 
and FAD synthetase) 


tRNA pseudouridine synthase B 


hypothetical protein 


hypothetical protein 


phosphoesterase 


DNA damage-inducible protein F 


hypothetical protein 


ribosome-binding factor A 


translation initiation factor IF-2 


hypothetical protein 


N-utilization substance protein 
(transcriptional 

termination/antitermination factor) 




hypothetical protein 


peptide-binding protein 


peptide transport system permease 


oligopeptide permease 


peptide transport system ABC- 
transporter ATP-binding protein 
46.9 


51.0 


36.7 
44.6 


42.3 




34.6 


25.3 
38.4 


57.6 


Homologous gene 


Corynebacterium 
ammoniagenes ATCC 6872 ribF 


Bacillus subtilis 168 truB 


Corynebacterium 
ammoniagenes 


Streptomyces coelicolor A3(2) 
SC5A7.23 


Mycobacterium tuberculosis 
H37Rv Rv2795c 


Mycobacterium tuberculosis 
H37Rv Rv2836c dinF 


Mycobacterium tuberculosis 
H37Rv Rv2837c 


Bacillus subtilis 168 rbfA 


Stigmatella aurantiaca DW4 infB 


Streptomyces coelicolor A3(2) 
SC5H4.29 


Bacillus subtilis 168 nusA 




Mycobacterium tuberculosis 
H37Rv Rv2842c 


Bacillus subtilis 168 dppE 


Escherichia coli K12 dppB 


Bacillus subtilis spo0KC


Mycobacterium tuberculosis 
H37Rv Rv3663c dppD 
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Function 


prolyl-tRNA synthetase 



hypothetical protein 


magnesium-chelatase subunit 


magnesium-chelatase subunit 


uroporphyrinogen III 
methyltransferase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


glutathione reductase 










methionine aminopeptidase 


penicillin binding protein 


response regulator (two-component 
system response regulator) 
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histidine kinase 
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Homologous gene 


Mycobacterium tuberculosis 
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Streptomyces coelicolor A3(2) 
SCC30.05 


Rhodobacter sphaeroides ATCC 
17023 bchD 


Heliobacillus mobilis bchI


Propionibacterium freudenreichii 
cobA


Clostridium perfringens NCIB 
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Escherichia coli K12map 
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Function 


ABC transporter 




hypothetical protein (gcpb protein)




hypothetical membrane protein 


polypeptides can be used as 
vaccines against Chlamydia 
trachomatis 


1-deoxy-D-xylulose-5-phosphate
reductoisomerase








ABC transporter ATP-binding protein 


pyruvate formate-lyase 1 activating 
enzyme 


hypothetical membrane protein 
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ribosome recycling factor 


uridylate kinase 
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30S ribosomal protein S2 | 
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Function 


transcriptional accessory protein 


sporulation-specific degradation 
regulator protein 


dicarboxylase translocator 


2-oxoglutarate/malate translocator 


3-carboxy-cis,cis-muconate
cycloisomerase 








tRNA (guanine-N1)-
methyltransferase


hypothetical protein 


16S rRNA processing protein 


hypothetical protein 


30S ribosomal protein S16 


inversin 


ABC transporter 


ABC transporter 


signal recognition particle protein 








cell division protein
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Function 






glucan 1,4-alpha-glucosidase or 
glucoamylase S1/S2 precursor 




chromosome segregation protein 


acylphosphatase 




transcriptional regulator 


hypothetical membrane protein 






cation efflux system protein 


formamidopyrimidine-DNA 
glycosylase 


ribonuclease III


hypothetical protein 


hypothetical protein 


transport protein 


ABC transporter 


hypothetical protein 
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Function 


hypothetical protein 


peptidase 


sucrose transport protein 






maltodextrin phosphorylase/ 
glycogen phosphorylase 


hypothetical protein 


prolipoprotein diacylglyceryl 
transferase 


indole-3-glycerol-phosphate 
synthase / anthranilate synthase 
component II


hypothetical membrane protein 


phosphoribosyl-AMP cyclohydrolase 


cyclase 


inositol monophosphate 
phosphatase 


phosphoribosylformimino-5-
aminoimidazole carboxamide
ribotide isomerase


glutamine amidotransferase 


chloramphenicol resistance protein 
or transmembrane transport protein 
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Homologous gene 
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Function 


isopentenyl-diphosphate Delta- 
isomerase 












beta C-S lyase (degradation of 
aminoethylcysteine) 


branched-chain amino acid transport 
system carrier protein (isoleucine 
uptake) 


alkanal monooxygenase alpha chain 




malonate transporter 


glycolate oxidase subunit 


transcriptional regulator 




hypothetical protein 




heme-binding protein A precursor 
(hemin-binding lipoprotein) 


oligopeptide ABC transporter 
(permease) 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


Matched 
length 
(aa) 


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Homologous gene 


Chlamydomonas reinhardtii ipM 












Corynebacterium glutamicum 
ATCC 13032 aecD 


Corynebacterium glutamicum 
ATCC 13032 brnQ 


Vibrio harveyi luxA 




Sinorhizobium meliloti mdcF 


Escherichia coli K12 glcD 


Escherichia coli K12 ydfH 




Salmonella typhimurium ygiK 




Haemophilus influenzae Rd 
HI0853 hbpA


Bacillus subtilis 168 appB 


Escherichia coli K12 dppC 


Escherichia coli K12 oppD 
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(nt) 


2441005 
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2441602 


2443356 


2444033 


2445709 


2446993 


2447998 


2450323 


2450859 


2451794 


2455435 


2455452 


2455720 


2457337 


2459371 


2460336 


2461167 


2462599 
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(nt) 


2441589 


2441669 


2442355 


2443356 


2444015 


2444551 


2444735 


2445716 


2447021 


2450844 


2451785 


2454637 


2454725 


2455733 


2457066 


2457759 


2457863 


2459371 


2460340 


2461163 
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Function 


hypothetical protein 


hypothetical protein 


ribose kinase 


hypothetical membrane protein 




sodium-dependent transporter or
sodium bile acid symporter family


apospory-associated protein C 




thiamine biosynthesis protein X


hypothetical protein 


glycine betaine transporter 








large integral C4-dicarboxylate 
membrane transport protein 


small integral C4-dicarboxylate 
membrane transport protein 


C4-dicarboxylate-binding 
periplasmic protein precursor 


extensin I 


GTP-binding protein 
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39.9 
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28.5 
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42.6 


39.8 








34.6 


33.9 


28.2 


63.0 


58.7 


Homologous gene 


Aeropyrum pernix K1 APE1580 


Aquifex aeolicus VF5 aq_768 


Rhizobium etli rbsK


Streptomyces coelicolor A3(2) 
SCM2.16c 




Homo sapiens 


Chlamydomonas reinhardtii 




Corynebacterium glutamicum 
ATCC 13032 thiX 


Mycobacteriophage D29 66 


Corynebacterium glutamicum
ATCC 13032 betP 








Rhodobacter capsulatus dctM 


Klebsiella pneumoniae dctQ 


Rhodobacter capsulatus B10 
dctP 


Lycopersicon esculentum 
(tomato) 


Bacillus subtilis 168 lepA 
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Function 


xanthine permease 


2,5-diketo-D-gluconic acid reductase






50S ribosomal protein L27 


50S ribosomal protein L21


ribonuclease E 








hypothetical protein 


transposase (insertion sequence 
IS31831) 


hypothetical protein 


hypothetical protein 


nucleoside diphosphate kinase 




hypothetical protein 


hypothetical protein 


hypothetical protein 


Matched 
length 
(aa) 
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(%) 
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82.2 


56.6 








82.6 


100.0 


76.9 


67.8 


89.6 




67.4 


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Identity 
(%) 


39.1 


61.2 






80.3 


56.4 


30.1 








61.0 


99.1 


51.3 


37.8 


70.9 




34.8 


36.6 


33.9 


Homologous gene 


Bacillus subtilis 168 pbuX 


Corynebacterium sp. ATCC 
31090 






Streptomyces griseus IFO13189
rpmA 


Streptomyces griseus IFO13189
obg 


Escherichia coli K12 rne








Streptomyces coelicolor A3(2) 
SCF76.08c 


Corynebacterium glutamicum 
ATCC 31831 


Streptomyces coelicolor A3(2) 
SCF76.08c


Streptomyces coelicolor A3(2) 
SCF76.09 


Mycobacterium smegmatis ndk 




Deinococcus radiodurans R1 
DR1844 


Mycobacterium tuberculosis 
H37Rv Rv1883c 


Mycobacterium tuberculosis 
H37Rv Rv2446c 
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toluate 1,2 dioxygenase subunit 


toluate 1,2 dioxygenase subunit 


1,2-dihydroxycyclohexa-3,5-diene 
carboxylate dehydrogenase 


regulator of LuxR family with ATP- 
binding site 


transmembrane transport protein or 
4-hydroxybenzoate transporter 


benzoate membrane transport 
protein 


ATP-dependent Clp protease 
proteolytic subunit 2 


ATP-dependent Clp protease 
proteolytic subunit 1 


hypothetical protein 


trigger factor (prolyl isomerase) 
(chaperone protein) 
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Pseudomonas putida plasmid 
pDK1 xylY 


Pseudomonas putida plasmid 
pDK1 xylZ 


Pseudomonas putida plasmid 
pDK1 xylL 


Rhodococcus erythropolis thcG 


Acinetobacter calcoaceticus 
pcaK 


Acinetobacter calcoaceticus 
benE 


Streptomyces coelicolor M145
clpP2


Streptomyces coelicolor M145 
clpP1 


Sulfolobus islandicus ORF154 


Bacillus subtilis 168 tig 


Streptomyces coelicolor A3(2) 
SCD25.17 


Nocardia lactamdurans LC411
pbp 


Mus musculus Moa1 




Corynebacterium striatum ORF1 




Corynebacterium striatum ORF1 


Corynebacterium striatum ORF1 
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Function 




acetylornithine aminotransferase 


hypothetical protein 


hypothetical membrane protein 


acetoacetyl CoA reductase 


transcriptional regulator, TetR family 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 


ABC transporter ATP-binding protein 


globin 


chromate transport protein 


hypothetical protein 


hypothetical protein 




hypothetical protein 


ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical membrane protein 


alkaline phosphatase
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Function 


ferric enterochelin esterase 


lipoprotein 








transposase (IS1207) 






transcriptional regulator 


glutaminase 


sporulation-specific degradation 
regulator protein 




uronate isomerase 




hypothetical protein 


pyrazinamidase/nicotinamidase 


hypothetical protein 


bacterioferritin comigratory protein 


bacterial regulatory protein, tetR 
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DAWLEY KIDNEY 
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phosphate transport system 
regulatory protein 
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Mycobacterium tuberculosis 
H37Rv Rv0821c phoY-2


Pseudomonas aeruginosa pstB 


Mycobacterium tuberculosis 
H37Rv Rv0830 pstA1 


Mycobacterium tuberculosis 
H37Rv Rv0829 pstC2 


Mycobacterium tuberculosis 
H37Rv phoS2 


Streptomyces coelicolor A3(2)
SCD84.18c 




Bacillus subtilis 168 bmrU 


Mycobacterium tuberculosis 
H37Rv Rv0813c


Solanum tuberosum BCAT2 


Corynebacterium 
ammoniagenes ATCC 6872 
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Function 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein


hypothetical protein 


5'-phosphoribosyl-N-
formylglycinamidine synthetase




5'-phosphoribosyl-N-
formylglycinamidine synthetase


hypothetical protein 




glutathione peroxidase


extracellular nuclease 




hypothetical protein 


C4-dicarboxylate transport 


dipeptidyl aminopeptidase 
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length 
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ammoniagenes ATCC 6872 
purQ 


Corynebacterium 
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Aeromonas hydrophila JMP636 
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Salmonella typhimurium LT2 
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Function 


dihydrodipicolinate synthase 


glucokinase 


N-acetylmannosamine-6-phosphate 
epimerase 




sialidase precursor 


L-asparagine permease operon 
repressor 


dipeptide transporter protein or 
heme-binding protein 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


oligopeptide transport ATP-binding 
protein 


homoserine/homoserine lactone
efflux protein or lysE type 
translocator 


leucine-responsive regulatory 
protein 




hypothetical protein 


hypothetical protein 


transcription factor 


Matched 
length 
(aa) 
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57.6 


68.6 
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64.3 


78.3 
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62.7 


66.2 




86.2 


71.5 


91.1 
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Identity 
(%) 


28.2 


28.7 


36.4 




24.8 


26.6 


22.5 


31.9 


46.5 


43.4 


28.5 


31.0 




55.9 


46.4 


73.3 


Homologous gene 


Escherichia coli K12 dapA


Streptomyces coelicolor A3(2)
SC6E10.20c glk


Clostridium perfringens NCTC 
8798 nanE 




Micromonospora viridifaciens 
ATCC 31146 nadA 


Rhizobium etli ansR 


Bacillus firmus OF4 dppA
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Function 


cytochrome b6-f complex iron-sulfur
subunit (Rieske iron-sulfur protein) 


NADH oxidase or NADH-dependent 
flavin oxidoreductase 


hypothetical membrane protein 


hypothetical protein 


bacterial regulatory protein, arsR 
family or methylenomycin A
resistance protein 


NADH oxidase or NADH-dependent 
flavin oxidoreductase 


hypothetical protein 










acetoin(diacetyl) reductase (acetoin 
dehydrogenase) 


hypothetical protein 


di-/tripeptide transporter




bacterial regulatory protein, tetR 
family 


hydroxyquinol 1,2-dioxygenase 


Matched 
length 
(aa) 
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CN 


Similarity
(%) 


63.6 


64.3 


74.7 


54.6 


79.4 


64.3 


69.5 










52.9 


84.5 


71.6 




50.5 


62.2 


Identity 
(%) 


32.5 


33.3 


43.6 


34.0 


45.1 


33.4 



31.4 










26.9 


53.5 


34.5 




26.1 


31.7 


Homologous gene 


Chlorobium limicola petC 


Thermoanaerobacter brockii 
nadO 


Escherichia coli K12 yfeH 


Streptomyces coelicolor A3(2) 
SCI11.36c 


Streptomyces coelicolor Plasmid 
SCP1 mmr 


Thermoanaerobacter brockii 
nadO 


Saccharomyces cerevisiae 
ymyO 










Klebsiella terrigena budC 


Mycobacterium tuberculosis 
H37Rv Rv2094c 


Lactococcus lactis subsp. lactis 
dtpT 




Escherichia coli K12 acrR


Acinetobacter calcoaceticus 
catA 
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Terminal 
(nt) 


3245766 


3245822 


3248205 


3249165 


3249187 


3250742 


3251405 


3251466 


3251743 


3252133 


3252316 


3253480 


3253739 


3253824 


3255719 
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3256471 


Initial 
(nt) 


3245317 


3246931 


3247234 


3248392 
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Example 2 

Determination of effective mutation site 

(1) Identification of mutation site based on the comparison 
of the gene nucleotide sequence of lysine-producing B-6 
strain with that of wild type strain ATCC 13032 

Corynebacterium glutamicum B-6, which is resistant
to S-(2-aminoethyl)cysteine (AEC), rifampicin, streptomycin
and 6-azauracil, is a lysine-producing mutant bred by
subjecting the wild type ATCC 13032 strain to multiple
rounds of random mutagenesis with a mutagen,
N-methyl-N'-nitro-N-nitrosoguanidine (NTG), and screening
(Appl. Microbiol. Biotechnol., 32: 269-273 (1989)). First,
the nucleotide sequences of genes derived from the B-6
strain and considered to relate to lysine production were
determined by a method similar to the above. The genes
relating to lysine production include lysE and lysG, which
are lysine-excreting genes; ddh, dapA, hom and lysC
(encoding diaminopimelate dehydrogenase,
dihydrodipicolinate synthase, homoserine dehydrogenase and
aspartokinase, respectively), which are lysine-biosynthetic
genes; and pyc and zwf (encoding pyruvate carboxylase and
glucose-6-phosphate dehydrogenase, respectively), which are
glucose-metabolizing genes. The nucleotide sequences of
the genes derived from the production strain were compared
with the corresponding nucleotide sequences of the ATCC
13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed. As a result, mutation points were observed in
many genes. For example, no mutation site was observed in
lysE, lysG, ddh, dapA, and the like, whereas amino acid
replacement mutations were found in hom, lysC, pyc, zwf,
and the like. Among these mutation points, those
considered to contribute to the production were extracted
on the basis of known biochemical or genetic information.
Among the mutation points thus extracted, a mutation,
Val59Ala, in hom and a mutation, Pro458Ser, in pyc were
evaluated for effectiveness according to the following
method.
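
The comparison described above, in which a gene from the
production strain is checked against the corresponding wild
type gene for amino acid replacements, can be illustrated by
a minimal sketch. It is not the procedure actually used in
this Example. It assumes two in-frame coding sequences of
equal length with no insertions or deletions, and it uses the
standard codon assignments. The function names and the toy
sequences are hypothetical and are not taken from the
sequence listing.

```python
# Sketch: report amino acid replacements between two aligned coding sequences.
# Assumptions: both sequences are in frame, of equal length, and free of indels.

BASES = "TCAG"
# Standard codon assignments listed in TCAG order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(cds):
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def amino_acid_replacements(wild_cds, mutant_cds):
    """Return replacements in the 'Val59Ala' style (1-based residue positions)."""
    if len(wild_cds) != len(mutant_cds):
        raise ValueError("sequences must have equal length (no indels assumed)")
    wild, mutant = translate(wild_cds), translate(mutant_cds)
    return [
        f"{THREE_LETTER[w]}{pos}{THREE_LETTER[m]}"
        for pos, (w, m) in enumerate(zip(wild, mutant), start=1)
        if w != m
    ]


# Toy sequences (hypothetical): codon 2 changes from GTT (Val) to GCT (Ala).
print(amino_acid_replacements("ATGGTTAAA", "ATGGCTAAA"))
```

A silent nucleotide change produces no entry in the returned
list, which is why only amino acid replacement mutations are
reported by such a comparison.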

(2) Evaluation of mutation, Val59Ala, in hom and mutation,
Pro458Ser, in pyc

It is known that a mutation in hom inducing a
requirement or partial requirement for homoserine imparts
lysine productivity to a wild type strain (Amino Acid
Fermentation, ed. by Hiroshi Aida et al., Japan Scientific
Societies Press). However, the relationship between the
mutation, Val59Ala, in hom and lysine production is not
known. Whether or not the mutation, Val59Ala, in hom is an
effective mutation can be examined by introducing the
mutation into the wild type strain and examining the
lysine productivity of the resulting strain. On the other
hand, whether or not the mutation, Pro458Ser, in pyc is
effective can be examined by introducing this mutation
into a lysine-producing strain which has a deregulated
lysine-biosynthetic pathway and is free from the pyc
mutation, and comparing the lysine productivity of the
resulting strain with that of the parent strain. As such a
lysine-producing bacterium, No. 58 strain (FERM BP-7134) was
selected (hereinafter referred to as the "lysine-producing
No. 58 strain" or the "No. 58 strain"). Based on the above,
it was decided to introduce the mutation, Val59Ala, in hom
and the mutation, Pro458Ser, in pyc into the wild type
strain of Corynebacterium glutamicum ATCC 13032
(hereinafter referred to as the "wild type ATCC 13032
strain" or the "ATCC 13032 strain") and the lysine-
producing No. 58 strain, respectively, using the gene
replacement method. A plasmid vector pCES30 for the gene
replacement was constructed by the following method.

A plasmid vector pCE53 having a kanamycin-resistant
gene and being capable of autonomously replicating in
coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984))
and a plasmid pMOB3 (ATCC 77282) containing a levansucrase
gene (sacB) of Bacillus subtilis (Molecular Microbiology,
6: 1195-1204 (1992)) were each digested with PstI. Then,
after agarose gel electrophoresis, a pCE53 fragment and a
2.6 kb DNA fragment containing sacB were each extracted and
purified using GENE CLEAN Kit (manufactured by BIO 101).
The pCE53 fragment and the 2.6 kb DNA fragment were ligated
using Ligation Kit ver. 2 (manufactured by Takara Shuzo),
introduced into the ATCC 13032 strain by the
electroporation method (FEMS Microbiology Letters, 65: 299
(1989)), and cultured on BYG agar medium (medium prepared
by adding 10 g of glucose, 20 g of peptone (manufactured by
Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured
by Difco), and 16 g of Bactoagar (manufactured by Difco) to
1 liter of water, and adjusting its pH to 7.2) containing
25 µg/ml kanamycin at 30°C for 2 days to obtain a
transformant that had acquired kanamycin resistance. As a
result of digestion analysis with restriction enzymes, it
was confirmed that a plasmid extracted from the resulting
transformant by the alkali SDS method had a structure in
which the 2.6 kb DNA fragment had been inserted into the
PstI site of pCE53. This plasmid was named pCES30.

Next, two genes having a mutation point, hom and
pyc, were amplified by PCR, and inserted into pCES30
according to the TA cloning method (Bio Experiment
Illustrated, vol. 3, published by Shujunsha). Specifically,
pCES30 was digested with BamHI (manufactured by Takara
Shuzo), subjected to agarose gel electrophoresis, and
extracted and purified using GENE CLEAN Kit (manufactured by
BIO 101). Both ends of the resulting pCES30 fragment
were blunted with DNA Blunting Kit (manufactured by Takara
Shuzo) according to the attached protocol. The blunt-ended
pCES30 fragment was concentrated by extraction with
phenol/chloroform and precipitation with ethanol, and
allowed to react in the presence of Taq polymerase
(manufactured by Roche Diagnostics) and dTTP at 70°C for 2
hours so that a nucleotide, thymine (T), was added to the
3'-end to prepare a T vector of pCES30.

Separately, chromosomal DNA was prepared from the
lysine-producing B-6 strain according to the method of
Saito et al. (Biochim. Biophys. Acta, 72: 619 (1963)).
Using the chromosomal DNA as a template, PCR was carried
out with Pfu turbo DNA polymerase (manufactured by
Stratagene). For the mutated hom gene, the DNAs having the
nucleotide sequences represented by SEQ ID NOS:7002 and
7003 were used as the primer set. For the mutated pyc gene,
the DNAs having the nucleotide sequences represented by SEQ
ID NOS:7004 and 7005 were used as the primer set. The
resulting PCR product was subjected to agarose gel
electrophoresis, and extracted and purified using GENE CLEAN
Kit (manufactured by BIO 101). Then, the PCR product was
allowed to react in the presence of Taq polymerase
(manufactured by Roche Diagnostics) and dATP at 72°C for 10
minutes so that a nucleotide, adenine (A), was added to the
3'-end.

The above pCES30 T vector fragment and the PCR product
of the mutated hom gene (1.7 kb) or the mutated pyc gene
(3.6 kb) to which the nucleotide A had been added were
concentrated by extraction with phenol/chloroform and
precipitation with ethanol, and then ligated using Ligation
Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method,
and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant
transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml
kanamycin, and a plasmid was extracted from the culture
according to the alkali SDS method. As a
result of digestion analysis using restriction enzymes, it
was confirmed that the plasmid had a structure in which the
1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pChom59 and
pCpyc458, respectively.

The mutations were introduced into the wild type
ATCC 13032 strain and the lysine-producing No. 58 strain
by the gene replacement method as follows.
Specifically, pChom59
and pCpyc458 were introduced into the ATCC 13032 strain and
the No. 58 strain, respectively, and strains in which the
plasmid was integrated into the chromosomal DNA by
homologous recombination were selected using the method of
Ikeda et al. (Microbiology, 144: 1863 (1998)). Then, the
strains in which the second homologous recombination had
occurred were selected by a selection method making use
of the fact that the Bacillus subtilis levansucrase encoded
by pCES30 produces a suicide substrate (J. Bacteriol.,
174: 5462 (1992)). Among the selected strains, strains in
which the wild type hom and pyc genes possessed by the ATCC
13032 strain and the No. 58 strain were replaced with the
mutated hom and pyc genes, respectively, were isolated.
The method is specifically explained below.

One strain was selected from the transformants
containing the plasmid, pChom59 or pCpyc458, and the
selected strain was cultured in BYG medium containing 20
µg/ml kanamycin, and pCG11 (Japanese Published Examined
Patent Application No. 91827/94) was introduced thereinto
by the electroporation method. pCG11 is a plasmid vector
having a spectinomycin-resistant gene and a replication
origin which is the same as that of pCE53. After
introduction of pCG11, the strain was cultured on BYG agar
medium containing 20 µg/ml kanamycin and 100 µg/ml
spectinomycin at 30°C for 2 days to obtain transformants
resistant to both kanamycin and spectinomycin. The
chromosome of one strain of these transformants was
examined by Southern blotting hybridization according to
the method reported by Ikeda et al. (Microbiology, 144:
1863 (1998)). As a result, it was confirmed that pChom59 or
pCpyc458 had been integrated into the chromosome by
homologous recombination of the Campbell type. In such a
strain, the wild type and mutated hom or pyc genes are
present close to each other on the chromosome, and the
second homologous recombination is liable to arise
therebetween.

Each of these transformants (having been recombined
once) was spread on Suc agar medium (medium prepared by
adding 100 g of sucrose, 7 g of meat extract, 10 g of
peptone, 3 g of sodium chloride, 5 g of yeast extract
(manufactured by Difco), and 18 g of Bactoagar
(manufactured by Difco) to 1 liter of water, and adjusting
its pH to 7.2) and cultured at 30°C for a day. Then the
colonies thus growing were selected in each case. Since a
strain in which the sacB gene is present converts sucrose
into a suicide substrate, it cannot grow in this medium (J.
Bacteriol., 174: 5462 (1992)). On the other hand, a strain
in which the sacB gene was deleted due to the second
homologous recombination between the wild type and the
mutated hom or pyc genes positioned close to each other
forms no suicide substrate and, therefore, can grow in this
medium. In the homologous recombination, either the wild
type gene or the mutated gene is deleted together with the
sacB gene. When the wild type gene is deleted together with
the sacB gene, replacement with the mutated gene is
achieved.
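
The selection logic described above can be summarized in a
small sketch. It is a simplified model, not a simulation of
recombination: it assumes that a once-recombined strain
carries both alleles plus sacB, and that the second
recombination removes sacB together with exactly one of the
two alleles. The class and function names are hypothetical.

```python
# Sketch of the sacB counter-selection logic (simplified model, hypothetical names).
from dataclasses import dataclass


@dataclass(frozen=True)
class Strain:
    name: str
    alleles: tuple   # alleles of the target gene present on the chromosome
    has_sacB: bool   # True while the integrated plasmid is still present


def grows_on_sucrose(strain):
    """A strain carrying sacB forms a suicide substrate from sucrose and cannot grow."""
    return not strain.has_sacB


def second_recombination_products(integrant):
    """Each product keeps one allele and loses sacB with the other allele."""
    return [
        Strain(f"{integrant.name} (keeps {allele})", (allele,), False)
        for allele in integrant.alleles
    ]


def is_gene_replaced(strain):
    """Gene replacement has occurred when only the mutated allele remains."""
    return strain.alleles == ("mutant",)


integrant = Strain("ATCC 13032::pChom59", ("wild", "mutant"), True)
for product in second_recombination_products(integrant):
    print(product.name, grows_on_sucrose(product), is_gene_replaced(product))
```

Both products grow on the sucrose medium, so growth alone
does not distinguish them. This is why the allele retained
by each second recombinant still has to be determined by
sequencing.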

Chromosomal DNA of each of the second recombinants thus
obtained was prepared by the above method of Saito et
al. PCR was carried out using Pfu turbo DNA polymerase
(manufactured by Stratagene) and the attached buffer. For
the hom gene, DNAs having the nucleotide sequences
represented by SEQ ID NOS:7002 and 7003 were used as the
primer set. For the pyc gene, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7004 and
7005 were used as the primer set. The nucleotide sequences
of the PCR products were determined by the conventional
method to judge whether the hom or pyc gene of
the second recombinant was a wild type or a mutant. As a
result, the second recombinants designated HD-1 and
No. 58pyc were the target strains having the mutated hom
gene and pyc gene, respectively.

(3) Lysine production test of HD-1 and No. 58pyc strains 

The HD-1 strain (strain obtained by incorporating
the mutation, Val59Ala, in the hom gene into the ATCC 13032
strain) and the No. 58pyc strain (strain obtained by
incorporating the mutation, Pro458Ser, in the pyc gene into
the lysine-producing No. 58 strain) were subjected to a
culture test in a 5 l jar fermenter, using the ATCC 13032
strain and the lysine-producing No. 58 strain, respectively,
as controls. Lysine production was thus examined.

After culturing on BYG agar medium at 30°C for 24
hours, each strain was inoculated into 250 ml of a seed
medium (medium prepared by adding 50 g of sucrose, 40 g of
corn steep liquor, 8.3 g of ammonium sulfate, 1 g of urea,
2 g of potassium dihydrogenphosphate, 0.83 g of magnesium
sulfate heptahydrate, 10 mg of iron sulfate heptahydrate, 1
mg of copper sulfate pentahydrate, 10 mg of zinc sulfate
heptahydrate, 10 mg of β-alanine, 5 mg of nicotinic acid,
1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1
liter of water, adjusting its pH to 7.2, and then adding
30 g of calcium carbonate) contained in a 2 l baffled
Erlenmeyer flask and cultured therein at
30°C for 12 to 16 hours. The total amount of the seed
culture was inoculated into 1,400 ml of a main
culture medium (medium prepared by adding 60 g of glucose,
20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g
of potassium dihydrogenphosphate, 0.75 g of magnesium
sulfate heptahydrate, 50 mg of iron sulfate heptahydrate,
13 mg of manganese sulfate pentahydrate, 50 mg of calcium
chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of
zinc sulfate heptahydrate, 5 mg of nickel chloride
hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg
of ammonium molybdate tetrahydrate, 14 mg of nicotinic
acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride,
and 0.42 mg of biotin to 1 liter of water) contained in a 5
l jar fermenter and cultured therein at 32°C, 1 vvm and 800
rpm while controlling the pH to 7.0 with aqueous ammonia.
When glucose in the medium had been consumed, a glucose
feeding solution (medium prepared by adding 400 g of glucose
and 45 g of ammonium chloride to 1 liter of water) was
continuously added. The feeding solution was added
at a controlled speed so as to maintain the
dissolved oxygen concentration within a range of 0.5 to 3
ppm. After culturing for 29 hours, the culture was
terminated. The cells were separated from the culture
medium by centrifugation and then L-lysine hydrochloride in
the supernatant was quantified by high performance liquid
chromatography (HPLC). The results are shown in Table 2
below.
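
The media above are specified per 1 liter of water, whereas
250 ml of seed medium and 1,400 ml of main culture medium
are used. As a convenience, the scaling can be sketched as
follows. The sketch assumes simple proportional scaling by
volume, lists only a few of the components for brevity, and
uses a hypothetical function name.

```python
# Sketch: scale a per-liter recipe to a working volume (proportional scaling assumed).

def scale_recipe(per_liter, volume_ml):
    """Return the amount of each component needed for volume_ml of medium."""
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_liter.items()}


# Grams per liter, taken from the seed medium described in the text (partial list).
seed_medium_g_per_l = {
    "sucrose": 50.0,
    "corn steep liquor": 40.0,
    "ammonium sulfate": 8.3,
    "urea": 1.0,
}

# Grams per liter, taken from the main culture medium (partial list).
main_medium_g_per_l = {
    "glucose": 60.0,
    "corn steep liquor": 20.0,
    "ammonium chloride": 25.0,
}

for name, grams in scale_recipe(seed_medium_g_per_l, 250).items():
    print(f"seed, 250 ml: {name}: {grams:.3f} g")
for name, grams in scale_recipe(main_medium_g_per_l, 1400).items():
    print(f"main, 1,400 ml: {name}: {grams:.3f} g")
```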



Table 2 

Strain          L-Lysine hydrochloride yield (g/l)

ATCC 13032       0
HD-1             8
No. 58          45
No. 58pyc       51
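
The effect of each mutation in Table 2 can be expressed as
the difference from the corresponding control strain. The
following sketch uses only the four yields in the table. The
pairing of each mutant with its parent follows the text, and
the function name is hypothetical. A relative gain is not
defined for HD-1 because its parent yield is 0.

```python
# Sketch: yield gain of each mutant over its control, using the Table 2 values.

yields_g_per_l = {
    "ATCC 13032": 0,
    "HD-1": 8,
    "No. 58": 45,
    "No. 58pyc": 51,
}

# (parent, mutant) pairs as described in the culture test.
pairs = [("ATCC 13032", "HD-1"), ("No. 58", "No. 58pyc")]


def yield_gain(parent, mutant):
    """Return (absolute gain in g/l, relative gain in % or None if parent yield is 0)."""
    base = yields_g_per_l[parent]
    gain = yields_g_per_l[mutant] - base
    relative = None if base == 0 else 100.0 * gain / base
    return gain, relative


for parent, mutant in pairs:
    gain, relative = yield_gain(parent, mutant)
    rel_text = "n/a" if relative is None else f"{relative:.1f}%"
    print(f"{mutant} vs {parent}: +{gain} g/l ({rel_text})")
```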



As is apparent from the results shown in Table 2,
the lysine productivity was improved by introducing the
mutation, Val59Ala, in the hom gene or the mutation,
Pro458Ser, in the pyc gene. Accordingly, it was found that
both mutations are effective mutations relating to the
production of lysine. Strain AHP-3, in which the mutation,
Val59Ala, in the hom gene and the mutation, Pro458Ser, in
the pyc gene have been introduced into the wild type ATCC
13032 strain together with the mutation, Thr331Ile, in the
lysC gene, was deposited on December 5, 2000, in the
National Institute of Bioscience and Human Technology,
Agency of Industrial Science and Technology (Higashi 1-1-3,
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382.

Example 3 

Reconstruction of lysine-producing strain based on genome 
information 

The lysine-producing mutant B-6 strain (Appl.
Microbiol. Biotechnol., 32: 269-273 (1989)), which was
constructed from the wild type ATCC 13032 strain by
multiple rounds of random mutagenesis with NTG and
screening, produces a remarkably large amount of lysine
hydrochloride when cultured in a jar at 32°C using glucose
as a carbon source. However, since the fermentation period
is long, the production rate is less than 2.1 g/l/h. Among
the estimated at least 300 mutations introduced into the
B-6 strain, only the effective mutations relating to the
production of lysine were therefore reconstituted in the
wild type ATCC 13032 strain by breeding.

(1) Identification of mutation point and effective mutation 
by comparing the gene nucleotide sequence of the B-6 strain 
with that of the ATCC 13032 strain 

As described above, the nucleotide sequences of
genes derived from the B-6 strain were compared with the
corresponding nucleotide sequences of the ATCC 13032 strain
genome represented by SEQ ID NOS:1 to 3501 and analyzed to
identify many mutation points accumulated in the chromosome
of the B-6 strain. Among these, a mutation, Val59Ala, in
hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser,
in pyc and a mutation, Ala213Thr, in zwf were specified as
effective mutations relating to the production of lysine.
Breeding to reconstitute the 4 mutations in the wild type
strain and to construct an industrially important
lysine-producing strain was carried out according to the
method shown below.
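
The notation used for the mutation points (for example, Val59Ala) can be illustrated with a minimal sketch. It assumes that the wild type and mutant protein sequences are already aligned without gaps; the function name and the toy sequences are hypothetical and are not taken from SEQ ID NOS:1 to 3501.

```python
# Three-letter codes for the 20 standard amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def call_substitutions(wild_type, mutant):
    """List substitutions such as 'Val59Ala' between two gap-free,
    equal-length protein sequences (1-based residue numbering)."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    calls = []
    for position, (wt, mt) in enumerate(zip(wild_type, mutant), start=1):
        if wt != mt:
            calls.append(f"{THREE_LETTER[wt]}{position}{THREE_LETTER[mt]}")
    return calls


# Toy sequences invented for illustration only (hypothetical data).
toy_wild_type = "MTSAVKP"
toy_mutant = "MTSAAKP"
print(call_substitutions(toy_wild_type, toy_mutant))
```

Such a comparison only lists the differences; deciding which of them are effective mutations still requires the breeding experiments described in this Example.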

(2) Construction of plasmid for gene replacement having 
mutated gene 

The plasmid for gene replacement, pChom59, having 
the mutated hom gene and the plasmid for gene replacement, 
pCpyc458, having the mutated pyc gene were prepared in the 
above Example 2(2). Plasmids for gene replacement having 
the mutated lysC and zwf were produced as described below. 

The lysC and zwf genes having the mutation points were
amplified by PCR, and inserted into a plasmid for gene
replacement, pCES30, according to the TA cloning method
described in Example 2(2) (Bio Experiment Illustrated, Vol.
3).

Separately, chromosomal DNA was prepared from the
lysine-producing B-6 strain according to the above method
of Saito et al. Using the chromosomal DNA as a template,
PCR was carried out with Pfu turbo DNA polymerase
(manufactured by Stratagene). For the mutated lysC gene,
the DNAs having the nucleotide sequences represented by SEQ
ID NOS:7006 and 7007 were used as the primer set. For the
mutated zwf gene, the DNAs having the nucleotide sequences
represented by SEQ ID NOS:7008 and 7009 were used as the
primer set. The resulting PCR product was subjected to
agarose gel electrophoresis, and extracted and purified
using GENECLEAN Kit (manufactured by BIO 101). Then, the PCR
product was allowed to react in the presence of Taq DNA
polymerase (manufactured by Roche Diagnostics) and dATP at
72°C for 10 minutes so that a nucleotide, adenine (A), was
added to the 3'-end.

The above pCES30 T vector fragment and the PCR product
of the mutated lysC gene (1.5 kb) or mutated zwf gene
(2.3 kb) to which the nucleotide A had been added were
concentrated by extraction with phenol/chloroform and
precipitation with ethanol, and then ligated using Ligation
Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method,
and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant
transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml
kanamycin, and a plasmid was extracted from the culture
medium according to the alkali SDS method. As a
result of digestion analysis using restriction enzymes, it
was confirmed that the plasmid had a structure in which the
1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pClysC311 and
pCzwf213, respectively.

(3) Introduction of mutation, Thr311Ile, in lysC into one 
point mutant HD-1 

Since the one point mutant HD-1 in which
the mutation, Val59Ala, in hom was introduced into the wild
type ATCC 13032 strain had been obtained in Example 2(2),
the mutation, Thr311Ile, in lysC was introduced into the
HD-1 strain using pClysC311 produced in the above (2)
according to the gene replacement method described in
Example 2(2). PCR was carried out using chromosomal DNA of
the resulting strain and, as the primer set, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7006 and
7007 in the same manner as in Example 2(2). When
the nucleotide sequence of the PCR product
was determined in the usual manner, it was confirmed that
the strain, which was named AHD-2, was a two point mutant
having the mutated lysC gene in addition to the mutated hom
gene.

(4) Introduction of mutation, Pro458Ser, in pyc into two
point mutant AHD-2

The mutation, Pro458Ser, in pyc was introduced into
the AHD-2 strain using the pCpyc458 produced in Example
2(2) by the gene replacement method described in Example
2(2). PCR was carried out using chromosomal DNA of the
resulting strain and, as the primer set, DNAs having the
nucleotide sequences represented by SEQ ID NOS:7004 and
7005 in the same manner as in Example 2(2). When
the nucleotide sequence of the PCR product
was determined in the usual manner, it was confirmed that
the strain, which was named AHP-3, was a three point mutant
having the mutated pyc gene in addition to the mutated hom
gene and lysC gene.

(5) Introduction of mutation, Ala213Thr, in zwf into three 
point mutant AHP-3 

The mutation, Ala213Thr, in zwf was introduced into
the AHP-3 strain using the pCzwf213 produced in the above
(2) by the gene replacement method described in Example
2(2). PCR was carried out using chromosomal DNA of the
resulting strain and, as the primer set, DNAs having the
nucleotide sequences represented by SEQ ID NOS:7008 and
7009 in the same manner as in Example 2(2). When
the nucleotide sequence of the PCR product
was determined in the usual manner, it was confirmed that
the strain, which was named APZ-4, was a four point mutant
having the mutated zwf gene in addition to the mutated hom
gene, lysC gene and pyc gene.



(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 
strains 

The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained
above were subjected to a culture test in a 5 l jar
fermenter in accordance with the method of Example 2(3).

Table 3 shows the results.

Table 3

Strain      L-Lysine hydrochloride (g/l)    Productivity (g/l/h)

HD-1         8                              0.3
AHD-2       73                              2.5
AHP-3       80                              2.8
APZ-4       86                              3.0



Since the lysine-producing mutant B-6 strain, which
was bred by random mutation and selection,
shows a productivity of less than 2.1 g/l/h, the APZ-4
strain showing a high productivity of 3.0 g/l/h is useful
in industry.
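
The comparison with the B-6 strain can be sketched as follows, using only the values of Table 3. The sketch assumes that the productivity is the final titer divided by the total fermentation time, which this Example does not state explicitly; the function names are hypothetical.

```python
# Values from Table 3: strain -> (titer in g/l, productivity in g/l/h).
TABLE_3 = {
    "HD-1": (8, 0.3),
    "AHD-2": (73, 2.5),
    "AHP-3": (80, 2.8),
    "APZ-4": (86, 3.0),
}

# The text gives "less than 2.1 g/l/h" for the B-6 strain.
B6_PRODUCTIVITY_UPPER_BOUND = 2.1


def implied_hours(titer, productivity):
    """Fermentation time implied by titer / productivity (assumption:
    productivity is averaged over the whole fermentation)."""
    return titer / productivity


def strains_exceeding(threshold):
    """Strains in Table 3 whose productivity is at least the threshold."""
    return [name for name, (_, prod) in TABLE_3.items() if prod >= threshold]


for name, (titer, prod) in TABLE_3.items():
    print(name, round(implied_hours(titer, prod), 1), "h (implied)")
print(strains_exceeding(B6_PRODUCTIVITY_UPPER_BOUND))
```

Under this assumption, the two, three and four point mutants all exceed the upper bound given for the B-6 strain, while the one point mutant HD-1 does not.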

(7) Lysine fermentation by APZ-4 strain at high temperature

The APZ-4 strain, which had been reconstructed by
introducing 4 effective mutations into the wild type strain,
was subjected to the culturing test in a 5 l jar fermenter
in the same manner as in Example 2(3), except that the
culturing temperature was changed to 40°C.

The results are shown in Table 4.

Table 4

Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)

32                  86                              3.0
40                  95                              3.3

As is apparent from the results shown in Table 4,
the lysine hydrochloride titer and productivity obtained in
culturing at a high temperature of 40°C were comparable to
those at 32°C. In the mutated and bred lysine-producing B-6
strain constructed by repeating random mutation and
selection, the growth and the lysine productivity are
lowered at temperatures exceeding 34°C so that lysine
fermentation cannot be carried out, whereas lysine
fermentation can be carried out using the APZ-4 strain at a
high temperature of 40°C, so that the load of cooling is
greatly reduced and the strain is industrially useful.
The lysine fermentation at high temperatures can be
achieved because the high temperature adaptability
inherently possessed by the wild type strain is carried
over to the APZ-4 strain.

As demonstrated in the reconstruction of the
lysine-producing strain, the present invention provides a
novel breeding method effective for eliminating the
problems in the conventional mutants and acquiring
industrially advantageous strains. This methodology, which
reconstitutes the production strain by reconstituting the
effective mutations, is an approach which is efficiently
carried out using the nucleotide sequence information of
the genome disclosed in the present invention, and its
effectiveness was found for the first time in the present
invention.

Example 4 

Production of DNA microarray and use thereof

A DNA microarray was produced based on the
nucleotide sequence information of the ORFs deduced using
software from the full nucleotide sequence of
Corynebacterium glutamicum ATCC 13032, and genes whose
expression fluctuates depending on the carbon source during
culturing were searched for.

(1) Production of DNA microarray 

Chromosomal DNA was prepared from Corynebacterium
glutamicum ATCC 13032 by the method of Saito et al.
(Biochim. Biophys. Acta, 72: 619 (1963)). Based on 24
genes having the nucleotide sequences represented by SEQ ID
NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448,
3451, 3453, 3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488,
3489, 3494, 3496, and 3497 from the ORFs shown in Table 1
deduced from the full genome nucleotide sequence of
Corynebacterium glutamicum ATCC 13032 using software and
the nucleotide sequence of rabbit globin gene (GenBank
Accession No. V00882) used as an internal standard, oligo
DNA primers for PCR amplification represented by SEQ ID
NOS:7010 to 7059 targeting the nucleotide sequences of the
genes were synthesized in a usual manner.

As the oligo DNA primers used for the PCR, 
DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7010 and 7011 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 207, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7012 and 7013 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3433, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7014 and 7015 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO:281, 

DNAs having the nucleotide sequence represented by
SEQ ID NOS:7016 and 7017 were used for the amplification of
the DNA having the nucleotide sequence represented by SEQ
ID NO: 3435,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7018 and 7019 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3439,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7020 and 7021 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 765,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7022 and 7023 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3445,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7024 and 7025 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 1226,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7026 and 7027 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 1229, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7028 and 7029 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3448,

DNAs having the nucleotide sequence represented by
SEQ ID NOS:7030 and 7031 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3451,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7032 and 7033 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3453, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7034 and 7035 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3455, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7036 and 7037 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO:1743, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7038 and 7039 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3470, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7040 and 7041 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 2132, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7042 and 7043 were used for the amplification of
the DNA having the nucleotide sequence represented by SEQ
ID NO: 3476,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7044 and 7045 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3477,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7046 and 7047 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3485, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7048 and 7049 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3488,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7050 and 7051 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3489, 

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7052 and 7053 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3494,

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7054 and 7055 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3496,

DNAs having the nucleotide sequence represented by
SEQ ID NOS:7056 and 7057 were used for the amplification of 
the DNA having the nucleotide sequence represented by SEQ 
ID NO: 3497, and

DNAs having the nucleotide sequence represented by 
SEQ ID NOS:7058 and 7059 were used for the amplification of 
the DNA having the nucleotide sequence of the rabbit globin 
gene,

as the respective primer set. 
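
The assignment of primer sets listed above follows a regular pattern: consecutive pairs of SEQ ID NOS:7010 to 7059 are assigned to the 24 genes in the listed order, and the last pair to the rabbit globin gene. A minimal sketch of this bookkeeping is given below; the function name and the dictionary layout are hypothetical and serve only as an illustration.

```python
# Target genes in the order listed above (SEQ ID NOs), followed by the
# internal standard.
TARGETS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453,
    3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 3489, 3494, 3496, 3497,
    "rabbit globin",
]
FIRST_PRIMER_SEQ_ID = 7010


def primer_pairs(targets, first_primer=FIRST_PRIMER_SEQ_ID):
    """Map each target to its pair of primer SEQ ID NOs, assuming that
    consecutive pairs are assigned in list order."""
    pairs = {}
    for index, target in enumerate(targets):
        first = first_primer + 2 * index
        pairs[target] = (first, first + 1)
    return pairs


PAIRS = primer_pairs(TARGETS)
print(PAIRS[207], PAIRS[3476], PAIRS["rabbit globin"])
```

The mapping reproduces the pairs stated in the list, for example SEQ ID NOS:7042 and 7043 for SEQ ID NO:3476.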

The PCR was carried out for 30 cycles with each cycle
consisting of 15 seconds at 95°C and 3 minutes at 68°C
using a thermal cycler (GeneAmp PCR system 9600,
manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured
by Takara Shuzo), 100 ng of the chromosomal DNA and the
buffer attached to the TaKaRa Ex-Taq reagent. In the case
of the rabbit globin gene, a single-stranded cDNA which had
been synthesized from rabbit globin mRNA (manufactured by
Life Technologies) according to the manufacturer's
instructions using a reverse transcriptase RAV-2
(manufactured by Takara Shuzo) was used as the template.
The PCR product of each
gene thus amplified was subjected to agarose gel
electrophoresis and extracted and purified using QIAquick
Gel Extraction Kit (manufactured by QIAGEN). The purified
PCR product was concentrated by precipitating it with
ethanol and adjusted to a concentration of 200 ng/µl. Each
PCR product was spotted on a slide glass plate
(manufactured by Matsunami Glass) having MAS coating in 2
runs using GTMASS SYSTEM (manufactured by Nippon Laser &
Electronics Lab.) according to the manufacturer's
instructions.

(2) Synthesis of fluorescence labeled cDNA 

The ATCC 13032 strain was spread on BY agar medium
(medium prepared by adding 20 g of peptone (manufactured by
Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured
by Difco), and 16 g of Bactoagar (manufactured by Difco) to
1 liter of water and adjusting its pH to 7.2) and
cultured at 30°C for 2 days. Then, the cultured strain was
further inoculated into 5 ml of BY liquid medium and
cultured at 30°C overnight. Then, the cultured strain was
further inoculated into 30 ml of a minimum medium (medium
prepared by adding 5 g of ammonium sulfate, 5 g of urea,
0.5 g of monopotassium dihydrogenphosphate, 0.5 g of
dipotassium monohydrogenphosphate, 20.9 g of
morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate
heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg of
manganese sulfate monohydrate, 10 mg of ferrous sulfate
heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg of
copper sulfate, and 0.2 mg of biotin to 1 liter of water, and
adjusting its pH to 6.5) containing 110 mmol/l glucose or
200 mmol/l ammonium acetate, and cultured in an Erlenmeyer
flask at 30°C until the absorbance at 660 nm reached 1.0.
After the cells were collected by centrifuging at 4°C and
5,000 rpm for 10 minutes, total RNA was prepared from the
resulting cells according to the method of Bormann et al.
(Molecular Microbiology, 6: 317-326 (1992)). To avoid
contamination with DNA, the RNA was treated with DNase I
(manufactured by Takara Shuzo) at 37°C for 30 minutes and
then further purified using Qiagen RNeasy MiniKit
(manufactured by QIAGEN) according to the manufacturer's
instructions. To 30 µg of the resulting total RNA, 0.6 µl
of rabbit globin mRNA (50 ng/µl, manufactured by Life
Technologies) and 1 µl of a random 6 mer primer (500 ng/µl,
manufactured by Takara Shuzo) were added for denaturing at
65°C for 10 minutes, followed by quenching on ice. To the
resulting solution, 6 µl of a buffer attached to
SuperScript II (manufactured by Life Technologies), 3 µl of
0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l dATP, 25 mmol/l
dCTP, 25 mmol/l dGTP, 10 mmol/l dTTP), 1.5 µl of Cy5-dUTP
or Cy3-dUTP (manufactured by NEN) and 2 µl of SuperScript
II were added, and allowed to stand at 25°C for 10 minutes
and then at 42°C for 110 minutes. The RNA extracted from
the cells using glucose as the carbon source and the RNA
extracted from the cells using ammonium acetate were
labeled with Cy5-dUTP and Cy3-dUTP, respectively. After
the fluorescence labeling reaction, the RNA was digested by
adding 1.5 µl of 1 mol/l sodium hydroxide-20 mmol/l EDTA
solution and 3.0 µl of 10% SDS solution, and allowed to
stand at 65°C for 10 minutes. The two cDNA solutions after
the labeling were mixed and purified using Qiagen PCR
purification Kit (manufactured by QIAGEN) according to the
manufacturer's instructions to give a volume of 10 µl.

(3) Hybridization 

UltraHyb (110 µl) (manufactured by Ambion) and the
fluorescence-labeled cDNA solution (10 µl) were mixed and
subjected to hybridization and the subsequent washing of
slide glass using GeneTAC Hybridization Station
(manufactured by Genomic Solutions) according to the
manufacturer's instructions. The hybridization was carried
out at 50°C, and the washing was carried out at 25°C.

(4) Fluorescence analysis 

The fluorescence amount of each DNA array having the 
fluorescent cDNA hybridized therewith was measured using 
ScanArray 4000 (manufactured by GSI Lumonics) . 

Table 5 shows the Cy3 and Cy5 signal intensities of
the genes, corrected on the basis of the data of
the rabbit globin used as the internal standard, and the
Cy3/Cy5 ratios.

Table 5

SEQ ID NO    Cy3 intensity    Cy5 intensity    Cy3/Cy5

207           5248             3240             1.62
3433          2239             2694             0.83
281           2370             2595             0.91
3435          2566             2515             1.02
3439          5597             6944             0.81
765           6134             4943             1.24
3445          1169             1284             0.91
1226          1301             1493             0.87
1229          1168             1131             1.03
3448          1187             1594             0.74
3451          2845             3859             0.74
3453          3498             1705             2.05
3455          1491             1144             1.30
1743          1972             1841             1.07
3470          4752             3764             1.26
2132          1173             1085             1.08
3476          1847             1420             1.30
3477          1284             1164             1.10
3485          4539             8014             0.57
3488         34289             1398            24.52
3489         43645             1497            29.16
3494          3199             2503             1.28
3496          3428             2364             1.45
3497          3848             3358             1.15

The ORF function data estimated by using software
were searched for SEQ ID NOS:3488 and 3489 showing
remarkably strong Cy3 signals. As a result, it was found
that SEQ ID NOS:3488 and 3489 are a malate synthase gene
and an isocitrate lyase gene, respectively. It is known
that these genes are transcriptionally induced by acetic
acid in Corynebacterium glutamicum (Archives of
Microbiology, 168: 262-269 (1997)).
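
The selection of genes with remarkably strong Cy3 signals can be sketched as follows, using a subset of the corrected intensities of Table 5. The 10-fold threshold and the function names are assumptions made only for illustration; this Example does not define a numerical criterion.

```python
# Subset of Table 5: SEQ ID NO -> (Cy3 intensity, Cy5 intensity).
# Cy3 labels the ammonium acetate culture, Cy5 the glucose culture.
SIGNALS = {
    207: (5248, 3240),
    3453: (3498, 1705),
    3485: (4539, 8014),
    3488: (34289, 1398),
    3489: (43645, 1497),
}


def cy3_cy5_ratio(cy3, cy5):
    """Ratio of acetate-grown signal to glucose-grown signal."""
    return cy3 / cy5


def induced_on_acetate(signals, fold_threshold):
    """SEQ ID NOs whose Cy3/Cy5 ratio is at least fold_threshold."""
    return sorted(
        seq_id
        for seq_id, (cy3, cy5) in signals.items()
        if cy3_cy5_ratio(cy3, cy5) >= fold_threshold
    )


# Hypothetical threshold of 10-fold, chosen only for this illustration.
print(induced_on_acetate(SIGNALS, 10.0))
```

With this threshold, only SEQ ID NOS:3488 and 3489 are selected from the subset, in agreement with the observation above.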

As described above, a gene whose expression
fluctuates could be discovered by synthesizing appropriate
oligo DNA primers based on the ORF nucleotide sequence
information deduced using software from the full genomic
nucleotide sequence information of Corynebacterium
glutamicum ATCC 13032, amplifying the nucleotide sequences
of the genes using the genome DNA of Corynebacterium
glutamicum as a template in the PCR reaction, and thus
producing and using a DNA microarray.

This Example shows that the expression amounts of
the 24 genes can be analyzed using a DNA microarray. On the
other hand, the present DNA microarray techniques make it
possible to prepare DNA microarrays having thereon several
thousand gene probes at once. Accordingly, it is also
possible to prepare DNA microarrays having thereon all of
the ORF gene probes deduced from the full genomic
nucleotide sequence of Corynebacterium glutamicum ATCC
13032 determined by the present invention, and analyze the
expression profile at the total gene level of
Corynebacterium glutamicum using these arrays.

Example 5

Homology search using Corynebacterium glutamicum genome
sequence

(1) Search of adenosine deaminase

The amino acid sequence (ADD_ECOLI) of Escherichia
coli adenosine deaminase was obtained from Swiss-Prot
Database as the amino acid sequence of a protein whose
function had been confirmed as adenosine deaminase
(EC 3.5.4.4). By using the full length of this amino acid
sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or a database of the amino acid
sequences in the ORF regions deduced from the genome
sequence using FASTA program (Proc. Natl. Acad. Sci. USA,
85: 2444-2448 (1988)). A case where the E-value was 1e-10
or less was judged as being significantly homologous. As a
result, no sequence significantly homologous with the
Escherichia coli adenosine deaminase was found in the
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or the database of the amino
acid sequences in the ORF regions deduced from the genome
sequence. Based on these results, it is assumed that
Corynebacterium glutamicum contains no ORF having adenosine
deaminase activity and thus has no activity of converting
adenosine into inosine.

(2) Search of glycine cleavage enzyme

The sequences (GCSP_ECOLI, GCST_ECOLI and
GCSH_ECOLI) of glycine decarboxylase, aminomethyl
transferase and an aminomethyl group carrier, each of which
is a component of Escherichia coli glycine cleavage enzyme,
were obtained from Swiss-Prot Database as the amino acid
sequences of proteins whose function had been confirmed as
glycine cleavage enzyme (EC 2.1.2.10).

By using these full-length amino acid sequences as
a query, a homology search was carried out on a nucleotide
sequence database of the genome sequence of Corynebacterium
glutamicum or a database of the ORF amino acid sequences
deduced from the genome sequence using FASTA program. A
case where the E-value was 1e-10 or less was judged as being
significantly homologous. As a result, no sequence
significantly homologous with the glycine decarboxylase,
the aminomethyl transferase or the aminomethyl group
carrier, each of which is a component of Escherichia coli
glycine cleavage enzyme, was found in the nucleotide
sequence database of the genome sequence of Corynebacterium
glutamicum or the database of the ORF amino acid sequences
deduced from the genome sequence. Based on these results,
it is assumed that Corynebacterium glutamicum contains no
ORF having the activity of glycine decarboxylase,
aminomethyl transferase or the aminomethyl group carrier
and thus has no activity of the glycine cleavage enzyme.

(3) Search of IMP dehydrogenase

The amino acid sequence (IMDH_ECOLI) of Escherichia
coli IMP dehydrogenase was obtained from Swiss-Prot
Database as the amino acid sequence of a protein whose
function had been confirmed as IMP dehydrogenase
(EC 1.1.1.205). By using the full length of this amino acid
sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or a database of the ORF amino
acid sequences predicted from the genome sequence using
FASTA program. A case where the E-value was 1e-10 or less was
judged as being significantly homologous. As a result, the
amino acid sequences encoded by two ORFs, namely, an ORF
positioned in the region of the nucleotide sequence No.
615336 to 616853 (or ORF having the nucleotide sequence
represented by SEQ ID NO:672) and another ORF positioned in
the region of the nucleotide sequence No. 616973 to 618094
(or ORF having the nucleotide sequence represented by SEQ
ID NO:674), were significantly homologous with
Escherichia coli IMP dehydrogenase. By using the above-
described predicted amino acid sequences as a query in order
to examine the similarity of the amino acid sequences
encoded by the ORFs with IMP dehydrogenases of other
organisms in greater detail, a search was carried out on
GenBank (http://www.ncbi.nlm.nih.gov/) nr-aa database
(amino acid sequence database constructed on the basis of
GenBank CDS translation products, PDB database, Swiss-Prot
database, PIR database, PRF database by eliminating
duplicated registrations) using BLAST program. As a result,
both of the two amino acid sequences showed significant
homologies with IMP dehydrogenases of other organisms and
clearly higher homologies with IMP dehydrogenases than with
amino acid sequences of other proteins, and thus, it was
assumed that the two ORFs would function as IMP
dehydrogenase. Based on these results, it was therefore
assumed that Corynebacterium glutamicum has two ORFs having
the IMP dehydrogenase activity.
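
The judgment criterion used in this Example (an E-value of 1e-10 or less) and the lengths of the two ORFs can be sketched as follows. The hit names and E-values are hypothetical and do not come from an actual FASTA run; the ORF coordinates are those stated above and are assumed to be 1-based and inclusive.

```python
E_VALUE_CUTOFF = 1e-10  # criterion stated in this Example


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Names of hits whose E-value is at or below the cutoff.

    hits is a list of (name, e_value) tuples, for example as parsed
    from a search report (parsing is outside this sketch)."""
    return [name for name, e_value in hits if e_value <= cutoff]


def orf_length_nt(start, end):
    """Length in nucleotides of a region with inclusive coordinates."""
    return end - start + 1


# Genome coordinates given above: SEQ ID NO -> (start, end).
ORF_REGIONS = {672: (615336, 616853), 674: (616973, 618094)}

# Hypothetical hits invented for illustration only.
toy_hits = [("hit_a", 3e-42), ("hit_b", 1e-10), ("hit_c", 2e-3)]
print(significant_hits(toy_hits))

for seq_id, (start, end) in ORF_REGIONS.items():
    length = orf_length_nt(start, end)
    print(seq_id, length, "nt,", length // 3, "codons")
```

Both regions have lengths divisible by three (1518 and 1122 nucleotides), as expected for ORFs; whether a stop codon is included in the stated coordinates is not specified here.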

Example 6 

Proteome analysis of proteins derived from Corynebacterium
glutamicum

(1) Preparation of proteins derived from Corynebacterium
glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

Culturing tests of Corynebacterium glutamicum ATCC
13032 (wild type strain), Corynebacterium glutamicum FERM
BP-7134 (lysine-producing strain) and Corynebacterium
glutamicum FERM BP-158 (lysine-highly producing strain)
were carried out in a 5 l jar fermenter according to the
method in Example 2(3). The results are shown in Table 6.

Table 6

Strain          L-Lysine yield (g/l)

ATCC 13032       0
FERM BP-7134    45
FERM BP-158     60



After culturing, cells of each strain were
recovered by centrifugation. These cells were washed with
Tris-HCl buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml
protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)) three times to give washed cells which could be
stored under freezing at -80°C. The freeze-stored cells
were thawed before use, and used as washed cells.

The washed cells described above were suspended in
a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l
magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease
inhibitor (COMPLETE; manufactured by Boehringer Mannheim)),
and disrupted with a disruptor (manufactured by Brown)
under cooling. To the resulting disruption solution, DNase
was added to give a concentration of 50 mg/l, and the
solution was allowed to stand on ice for 10 minutes. The
solution was centrifuged (5,000 x g, 15 minutes, 4°C) to
remove the undisrupted cells as the precipitate, and the
supernatant was recovered.

To the supernatant, urea was added to give a
concentration of 9 mol/l, and an equivalent amount of a
lysis buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5%
mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE;
manufactured by Boehringer Mannheim)) was added thereto,
followed by thorough stirring at room temperature for
dissolving.

After being dissolved, the solution was centrifuged
at 12,000 x g for 15 minutes, and the supernatant was
recovered.

To the supernatant, ammonium sulfate was added to
the extent of 80% saturation, followed by thorough
stirring for dissolving.

After being dissolved, the solution was centrifuged
(16,000 x g, 20 minutes, 4°C), and the precipitate was
recovered. This precipitate was dissolved in the lysis
buffer again and used in the subsequent procedures as a
protein sample. The protein concentration of this sample
was determined by the protein quantification method of
Bradford.

(2) Separation of protein by two dimensional 
electrophoresis 

The first dimensional electrophoresis was carried
out as described below by the isoelectric electrophoresis
method.

A molded dry IPG strip gel (pH 4-7, 13 cm,
Immobiline DryStrips; manufactured by Amersham Pharmacia
Biotech) was set in an electrophoretic apparatus (Multiphor
II or IPGphor; manufactured by Amersham Pharmacia Biotech)
and a swelling solution (8 mol/l urea, 0.5% Triton X-100,
0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was packed
therein, and the gel was allowed to stand for swelling for
12 to 16 hours.

The protein sample prepared above was dissolved in
a sample solution (9 mol/l urea, 2% CHAPS, 1%
dithiothreitol, 2% Ampholine, pH 3-10), and then about 100
to 500 µg (in terms of protein) portions thereof were taken
and added to the swollen IPG strip gel.

The electrophoresis was carried out in the 4 steps
as defined below while controlling the temperature at 20°C:
step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V;
and
step 4: 1 hour at a constant voltage of 8,000 V.
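
The four steps above can be summarized as cumulative volt-hours, a quantity commonly used to describe isoelectric focusing runs. The following sketch assumes that the voltage rises linearly within each gradient step, which is not stated in this Example; the function name is hypothetical.

```python
# (start voltage in V, end voltage in V, duration in hours) for steps 1 to 4.
STEPS = [
    (0, 500, 1),
    (500, 1000, 1),
    (1000, 8000, 4),
    (8000, 8000, 1),
]


def volt_hours(start_v, end_v, hours):
    """Volt-hours of one step, assuming a linear voltage ramp
    (the mean voltage multiplied by the duration)."""
    return (start_v + end_v) / 2 * hours


per_step = [volt_hours(*step) for step in STEPS]
total = sum(per_step)
print(per_step, total)
```

Under the linear-ramp assumption, the program amounts to 27,000 Vh over 7 hours, most of it delivered in steps 3 and 4.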

After the isoelectric electrophoresis, the IPG
strip gel was removed from the holder and soaked in an
equilibration buffer A (50 mmol/l Tris-HCl, pH 6.8, 30%
glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and
another equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8,
6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide)
for 15 minutes to sufficiently equilibrate the gel.

After the equilibration, the IPG strip gel was
lightly rinsed in an SDS electrophoresis buffer (1.4%
glycine, 0.1% SDS, 0.3% Tris-HCl, pH 8.5), and the second
dimensional electrophoresis depending on molecular weight
was carried out as described below to separate the proteins.

Specifically, the above IPG strip gel was closely
placed on 14% polyacrylamide slab gel (14% polyacrylamide,
0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS,
0.1% TEMED, 0.1% ammonium persulfate) and subjected to
electrophoresis under a constant current of 30 mA at 20°C
for 3 hours to separate the proteins.

(3) Detection of protein spot 

Coomassie staining was performed by the method of
Görg et al. (Electrophoresis, 9: 531-546 (1988)) for the
slab gel after the second dimensional electrophoresis.
Specifically, the slab gel was stained under shaking at
25°C for about 3 hours, the excessive coloration was
removed with a decoloring solution, and the gel was
thoroughly washed with distilled water.

The results are shown in Fig. 2. The proteins
derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134
strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be
separated and detected as spots.

(4) In-gel digestion of detected protein spot 

The detected spots were each cut out from the gel
and transferred into a siliconized tube, and 400 µl of a 100
mmol/l ammonium bicarbonate : acetonitrile solution (1:1,
v/v) was added thereto, followed by shaking overnight and
freeze-drying as such. To the dried gel, 10 µl of a
lysylendopeptidase (LysC) solution (manufactured by WAKO,
prepared with 0.1% SDS-containing 50 mmol/l ammonium
bicarbonate to give a concentration of 100 ng/µl) was added
and the gel was allowed to stand for swelling at 0°C for 45
minutes, and then allowed to stand at 37°C for 16 hours.
After removing the LysC solution, 20 µl of an extracting
solution (a mixture of 60% acetonitrile and 5% formic acid)
was added, followed by ultrasonication at room temperature
for 5 minutes to disrupt the gel. After the disruption,
the extract was recovered by centrifugation (12,000 rpm, 5
minutes, room temperature). This operation was repeated
twice to recover the whole extract. The recovered extract
was concentrated by centrifugation in vacuo to halve the
liquid volume. To the concentrate, 20 µl of 0.1%
trifluoroacetic acid was added, followed by thorough
stirring, and the mixture was subjected to desalting using
ZipTip (manufactured by Millipore). The protein adsorbed
on the carriers of ZipTip was eluted with 5 µl of α-cyano-
4-hydroxycinnamic acid for use as a sample solution for
analysis.

(5) Mass spectrometry and amino acid sequence analysis of
protein spot with matrix assisted laser desorption
ionization time of flight mass spectrometer (MALDI-TOFMS)

The sample solution for analysis was mixed in an
equal amount with a solution of a peptide mixture for
mass calibration (300 nmol/l Angiotensin II, 300 nmol/l
Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 nmol/l bovine
insulin B chain), and 1 µl of the obtained solution was
spotted on a stainless probe and crystallized by
spontaneous drying.

As measurement instruments, a REFLEX MALDI-TOF mass
spectrometer (manufactured by Bruker) and an N2 laser (337
nm) were used in combination.

The analysis by PMF (peptide-mass fingerprinting)
was carried out using integration spectra data obtained by
measuring 30 times at an acceleration voltage of 19.0 kV and
a detector voltage of 1.50 kV under reflector mode
conditions. Mass calibration was carried out by the
internal standard method.

The PSD (post-source decay) analysis was carried
out using integration spectra obtained by successively
altering the reflection voltage and the detector voltage at
an acceleration voltage of 27.5 kV.

The masses and amino acid sequences of the peptide
fragments derived from the protein spot after digestion
were thus determined.
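
To illustrate what the measured peptide masses are compared
against, the sketch below cleaves a protein sequence on the
C-terminal side of every lysine, as lysylendopeptidase (LysC) does,
and computes the monoisotopic [M+H]+ mass of each peptide. The
function names and the toy protein sequence are hypothetical;
missed cleavages and the modification of cysteine by iodoacetamide
are ignored. The angiotensin II sequence (DRVYIHPF) is not given in
the text and is included only as a sanity check, because
angiotensin II is one of the calibrants mentioned above.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # singly protonated ion observed in MALDI


def lysc_digest(protein):
    """Split after every K (complete digestion, no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue == "K":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K
    return peptides


def mh_plus(peptide):
    """Monoisotopic [M+H]+ of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


toy_protein = "MSTAKDRVYIHPFKGLEEK"  # hypothetical sequence, illustration only
for peptide in lysc_digest(toy_protein):
    print(peptide, round(mh_plus(peptide), 4))

# Sanity check against a calibrant: angiotensin II, [M+H]+ about 1046.54
print("DRVYIHPF", round(mh_plus("DRVYIHPF"), 4))
```

A theoretical mass list of this kind, generated for every ORF, is
what a fingerprinting search matches the measured masses against.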



(6) Identification of protein spot 

From the amino acid sequence information of the
digested peptide fragments derived from the protein spot
obtained in the above (5), ORFs corresponding to the
protein were searched on the genome sequence database of
Corynebacterium glutamicum ATCC 13032 as constructed in
Example 1 to identify the protein.

The identification of the protein was carried out
using the MS-Fit program and the MS-Tag program of Intranet
Protein Prospector.
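
The search step can be pictured as follows. This is a minimal
sketch and not the algorithm of MS-Fit or MS-Tag: it simply reports
every ORF whose translated sequence contains all of the peptide
sequences read from the PSD spectra. The ORF identifiers and
sequences are invented placeholders, not entries of the database of
Example 1, and the function names are hypothetical. Isoleucine and
leucine are treated as equivalent because they have the same mass.

```python
def normalize(seq):
    """Treat I and L as identical because they are isobaric."""
    return seq.upper().replace("I", "L")


def find_orfs(peptides, orf_db):
    """Return IDs of ORFs whose protein sequence contains every peptide."""
    tags = [normalize(p) for p in peptides]
    return [
        orf_id
        for orf_id, protein in orf_db.items()
        if all(tag in normalize(protein) for tag in tags)
    ]


# Placeholder database: ORF ID -> translated amino acid sequence.
orf_db = {
    "orf_A": "MSTAKDRVYIHPFKGLEEK",
    "orf_B": "MKKLLPTAAGNWK",
    "orf_C": "MSTAKDRVYLHPFKAAEEK",
}

# One peptide may be ambiguous; a second peptide narrows the result.
print(find_orfs(["DRVYLHPFK"], orf_db))
print(find_orfs(["DRVYLHPFK", "GLEEK"], orf_db))
```

In this toy database the first query matches two ORFs and the
second only one, which is why several peptides per spot are useful.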

(a) Search and identification of gene encoding
high-expression protein

Among the proteins derived from Corynebacterium
glutamicum ATCC 13032 showing high expression amounts in
CBB-staining shown in Fig. 2A, the proteins corresponding
to Spots-1, 2, 3, 4 and 5 were identified by the above
method.

As a result, it was found that Spot-1 corresponded
to enolase which was a protein having the amino acid
sequence of SEQ ID NO: 4585; Spot-2 corresponded to
phosphoglycerate kinase which was a protein having the
amino acid sequence of SEQ ID NO: 5254; Spot-3 corresponded
to glyceraldehyde-3-phosphate dehydrogenase which was a
protein having the amino acid sequence represented by SEQ
ID NO: 5255; Spot-4 corresponded to fructose bisphosphate
aldolase which was a protein having the amino acid sequence
represented by SEQ ID NO: 6543; and Spot-5 corresponded to
triose phosphate isomerase which was a protein having the
amino acid sequence represented by SEQ ID NO: 5252.

These genes, represented by SEQ ID NOS: 1085, 1754,
1775, 3043 and 1752, which encode the proteins corresponding
to Spots-1, 2, 3, 4 and 5, respectively, encode known
proteins that are important in the central metabolic pathway
for maintaining the life of the microorganism. Particularly,
it is suggested that the genes of Spots-2, 3 and 5 form an
operon and a high-expression promoter is encoded in the
upstream thereof (J. of Bacteriol., 174: 6067-6086 (1992)).

Also, the protein corresponding to Spot-9 in Fig. 2
was identified in the same manner as described above, and
it was found that Spot-9 was an elongation factor Tu which
was a protein having the amino acid sequence represented by
SEQ ID NO: 6937, and that the protein was encoded by DNA
having the nucleotide sequence represented by SEQ ID
NO: 3437.

Based on these results, the proteins having high
expression level were identified by proteome analysis using
the genome sequence database of Corynebacterium glutamicum
constructed in Example 1. Thus, the nucleotide sequences
of the genes encoding the proteins and the nucleotide
sequences upstream thereof could be searched simultaneously.
Accordingly, it is shown that nucleotide sequences having a
function as a high-expression promoter can be efficiently
selected.
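
Once an ORF has been located on the genome, retrieving the sequence
upstream of it is a simple coordinate operation. The sketch below
is one possible way to do this under stated assumptions: the genome
is treated as a linear string, coordinates are 0-based and
half-open on the forward strand, and the function names, the toy
genome and the ORF coordinates are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length):
    """Return up to `length` bases upstream of an ORF.

    `start` and `end` are 0-based, half-open coordinates on the
    forward strand; `strand` is "+" or "-". The genome is treated
    as linear, so the region is truncated at the sequence ends.
    """
    if strand == "+":
        return genome[max(0, start - length):start]
    # For a minus-strand ORF, "upstream" lies after `end` on the
    # forward strand and is reported in the ORF's own orientation.
    return reverse_complement(genome[end:end + length])


# Toy genome (illustration only) with an ORF at positions 27-39.
genome = "TTGACAGCTAGCTATAATGCAGGAGGTATGAAACGTTAA"
print(upstream_region(genome, 27, 39, "+", 10))
```

The returned segment is only a candidate promoter region; whether
it actually has promoter activity has to be tested separately.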

(b) Search and identification of modified protein 

Among the proteins derived from Corynebacterium
glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6, 7 and 8
were identified by the above method. As a result, these
three spots all corresponded to catalase which was a
protein having the amino acid sequence represented by SEQ
ID NO: 3785.

Accordingly, Spots-6, 7 and 8, detected as
spots differing in isoelectric mobility, were all products
derived from a catalase gene having the nucleotide sequence
represented by SEQ ID NO: 285. Accordingly, it is shown
that the catalase derived from Corynebacterium glutamicum
FERM BP-7134 was modified after the translation.

Based on these results, it is confirmed that
various modified proteins can be efficiently searched by
proteome analysis using the genome sequence database of
Corynebacterium glutamicum constructed in Example 1.
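
The reasoning used here, namely that several spots mapping to one
gene point to post-translational modification, can be expressed as
a simple grouping. The sketch below uses only the spot assignments
reported in (a) and (b) above; pooling spots from different gels
into one table is a simplification made for illustration, and the
names SPOT_TO_PROTEIN and multi_spot_proteins are hypothetical.

```python
from collections import defaultdict

# Spot number -> SEQ ID NO of the identified protein, as reported above.
SPOT_TO_PROTEIN = {
    1: 4585, 2: 5254, 3: 5255, 4: 6543, 5: 5252,
    6: 3785, 7: 3785, 8: 3785, 9: 6937,
}


def multi_spot_proteins(spot_to_protein):
    """Return {SEQ ID NO: [spots]} for proteins found in more than one spot."""
    groups = defaultdict(list)
    for spot, seq_id in sorted(spot_to_protein.items()):
        groups[seq_id].append(spot)
    return {seq_id: spots for seq_id, spots in groups.items() if len(spots) > 1}


print(multi_spot_proteins(SPOT_TO_PROTEIN))
```

With the assignments above, only SEQ ID NO: 3785 (catalase) is
reported, with Spots-6, 7 and 8.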

(c) Search and identification of expressed protein
effective in lysine production

It was found that, in Fig. 2A (ATCC 13032: wild
type strain), Fig. 2B (FERM BP-7134: lysine-producing
strain) and Fig. 2C (FERM BP-158: lysine-highly producing
strain), the catalase corresponding to Spot-8 and the
elongation factor Tu corresponding to Spot-9 as identified
above showed a higher expression level with an increase
in the lysine productivity.

Based on these results, it was found that promising
mutated proteins can be efficiently searched and identified
in breeding aiming at strengthening the productivity of a
target product by the proteome analysis using the genome
sequence database of Corynebacterium glutamicum constructed
in Example 1.

Moreover, useful mutation points of useful mutants
can be easily specified by searching the nucleotide
sequences (nucleotide sequences of promoter, ORF, or the
like) relating to the identified proteins using the above
database and using primers designed on the basis of the
sequences. Once the mutation points are specified,
industrially useful mutants which have the useful
mutations or other useful mutations derived therefrom can
be easily bred.

While the invention has been described in detail
and with reference to specific embodiments thereof, it will
be apparent to one of skill in the art that various changes
and modifications can be made therein without departing
from the spirit and scope thereof. All references cited
herein are incorporated in their entirety.